Introduction


This volume had its origin in a ‘Microsymposium’
held  on  22  August  1981,  in  the  course  of  the  Twelfth 
International  Congress  of  the  International  Union 
of  Crystallography.  Certain  authors  were  invited  to 
review specific fields within the general area of crystallographic
statistics, and the rest of the time allotted to
the  symposium  was  filled  with  selected  contributed 
papers.  Additional  contributed  papers  were  presented 
as  posters  at  other  times  during  the  Congress,  and 
papers  of  all  three  types  were  discussed  at  an  ad-hoc 
session  following  the  symposium.  The  Indian 
Academy of Sciences, through Professor S. Ramaseshan,
has provided facilities for publication, and
authors  have  provided  manuscripts  and  read  proofs 
within  a  rather  tight  time  schedule.  I  and  the 
co-chairman  of  the  symposium,  Professor  Mary 
F.  Richardson,  are  greatly  indebted  to  Professor 
Ramaseshan  and  the  authors. 

Crystallographic  statistics,  in  the  sense  intended  in 
the  title  of  this  book,  began  almost  by  accident.  In 
1942,  Professor  S.  H.  Yu,  happily  able  to  be  present 
at  the  Congress,  submitted  to  Nature  a  paper  on  the 
determination of absolute from relative X-ray intensities.
The Editors of Nature sent the paper to the
Cavendish  Laboratory  for  an  opinion  on  its  merit. 
The  method  was  complicated  and  depended  on  the 
use  of  a  set  of  tables  not  available  in  the  United 
Kingdom,  but  Henry  Lipson  and  I  recommended  its 
publication  (Yu,  1942).  The  proposal  set  us  arguing 
over a practicable method of achieving the same purpose,
and the idea gradually emerged that the general
level of the intensities of the various reflexions from a
crystal must depend on the contents of the unit cell
and  not  on  the  details  of  the  atomic  arrangement. 
Lipson (unpublished) suggested calculating the structure
factors for an arbitrary arrangement of the atoms
in  the  unit  cell  and  comparing  the  average  calculated 
value  with  the  average  observed  value  for  suitable 
groups  of  reflexions,  but  I  wanted  a  tidier  approach. 

I  had  some  elementary  statistics  in  mind  in  connexion 
with  diffraction  by  disordered  structures  like  cobalt 
and  the  copper-gold  alloy  AuCu3,  and  it  soon  became 
evident  to  me  that  the  appropriate  statistical  variables 
to  use  were  the  X-ray  intensities,  not  the  structure 
factors.  A  very  short  calculation  showed  that  the  mean 
value of the intensity expressed in units of (electrons)²
is  equal  to  the  sum  of  the  squares  of  the  scattering 
factors  of  all  the  atoms  in  the  unit  cell.  Once  obtained, 
this  relation  is  practically  obvious  from  conservation 
of  energy,  but  it  is,  so  far  as  I  know,  the  first  published 
result  in  the  field  now  known  as  crystallographic 
statistics.  My  letter  to  Nature  (Wilson,  1942),  in  effect 
a  referee’s  report,  has  since  become  my  most  cited 
publication  (Garfield,  1974,  1976),  but  it  attracted  no 
notice  at  the  time,  for  very  understandable  reasons. 

The  subject  remained  quiescent  until  1948,  when 
Harker  (1948)  and  Hughes  independently  rediscovered 
the  main  result  of  my  1942  paper.  Hughes  (1949)  went 
further,  and  showed  empirically  that  the  distribution 
of the magnitudes of the structure factors was approximately
normal. He gave four examples, all of centrosymmetric
crystals, and it is not clear whether he
realized  that  non-centrosymmetric  crystals  would  have 
a  different  distribution  of  structure  factors.  Wilson 
(1949),  using  the  central-limit  theorem,  derived  the 
ideal  distribution  functions  for  both  centrosymmetric 
and  non-centrosymmetric  crystals.  The  derivation 
rested  on  the  explicit  assumptions  that  the  unit  cell 
contained  a  sufficiently  large  number  of  atoms,  that 
no  one  atom  or  small  group  of  atoms  dominated  the 
scattering,  and  that  the  effect  of  other  symmetry 
elements, except centring, could be neglected. There
was also an implicit assumption of negligible dispersion.
This time the results were quickly taken up by
other  workers,  both  on  the  purely  statistical  side,  and 
as the basis of direct methods of structure determination.
The current President of the International Union
of  Crystallography,  Professor  Jerome  Karle,  and  his 
co-worker  Professor  Herbert  Hauptman,  author  of  the 
introductory  paper  of  the  Microsymposium  and  of 
this  volume,  were  the  pioneers  in  the  development  of 
direct  methods  of  an  overtly  statistical  nature  (Karle 
&  Hauptman,  1953;  Hauptman  &  Karle,  1953);  there 
was, of course, a parallel development of non-statistical
or covertly statistical direct methods, such as
those  of  Sayre  (1952)  and  Cochran  (1952).  A  landmark 
in  crystallographic  statistics  was  the  publication  of  the 
monograph  by  Srinivasan  and  Parthasarathy  (1976). 
This,  and  such  books  as  Giacovazzo  (1980)  on  direct 
methods,  may  be  consulted  for  the  history  of  the 
growth  of  the  subjects. 

The  present  volume,  like  the  symposium  out  of 
which it arose, was planned to concentrate on crystallographic
statistics as such, and makes no attempt
to  include  methods  of  structure  determination,  though 
it  may  be  noted  in  passing  that  they  were  not  neglected 
at  the  Ottawa  Congress.  It  was  intended  to  include 
papers  on  the  following  themes,  in  addition  to 
Professor  Hauptman’s  general  introduction:  Bayesian 
statistics,  intensity  statistics,  statistics  of  recorded 
counts,  alternatives  to  least  squares,  and  Wiener 
methods  for  electron  density.  Ready  consent  was 
obtained  from  the  speakers  invited  for  four  of  these 
topics,  but  none  of  those  invited  to  review  alternatives 
to  least  squares  was  able  to  accept.  Some  contributed 
papers  touching  on  the  subject  via  altered  weights  are 
included  after  an  editorial  background  note. 

The  full  title  of  the  symposium  was  ‘Progress  and 
Problems  in  Crystallographic  Statistics’.  The  authors 
both  of  the  invited  and  of  the  contributed  papers 
have  naturally  concentrated  on  the  ‘progress’,  so  the 
balance has been redressed by editorial mention of
some  of  the  ‘problems’:  bias  in  the  estimation  of 
parameters,  in  the  note  on  alternatives  to  least  squares; 
and  doubts  about  the  effect  of  correlation  of  atomic 
positions  on  the  expressions  for  the  probability 
distribution  of  intensities  (p.  175  below).  There  is  a 
further  problem  about  intensity  statistics,  perhaps  of 
academic  interest  only:  the  functional  form  of  the 
distribution  for  really  large  intensities  is  as  yet 
unknown  (Wilson,  1980). 


A.  J.  C.  Wilson 
Crystallographic Statistics — General Review

By  Herbert  Hauptman 

Medical  Foundation  of  Buffalo,  Inc.,  73  High  Street, 
Buffalo,  NY  14203,  USA 

Abstract 

The applications of statistical methods in crystallography
fall into two major classes. The first is
concerned with the study of the statistical properties
of the intensities of X-rays diffracted by a crystal; the
second  with  those  of  groups  of  related  intensities.  The 
latter  is  the  basis  for  the  analysis  of  the  phase  problem 
of  X-ray  crystallography  by  probabilistic  methods, 
and  an  extensive  literature  describing  these  methods 
exists.  However,  this  work  is  outside  the  scope  of  the 
present paper, which is concerned only with an overview
of the statistical properties of the intensities
of  X-rays  scattered  by  a  crystal. 

1.  The  basic  Wilson  distributions 

Possibly  the  best  place  to  start  is  with  the  very  first 
applications  of  statistical  methods  in  crystallography 
made  by  Wilson  in  1949.  Not  only  does  this  work 
illustrate  in  the  clearest  possible  way  the  ideas  of 
random  variable  and  the  probability  distribution  of 
a random variable, but it also makes an easily understood
and important application of these ideas.

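Imagine that a crystal structure in the space group
$P1$ is given, that is to say a fixed set of atomic position
vectors $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_N$ is specified. Then the equation

$$E_{\mathbf{H}} = \frac{1}{\sigma_2^{1/2}} \sum_{j=1}^{N} f_j \exp(2\pi i\, \mathbf{H} \cdot \mathbf{r}_j), \tag{1}$$

where

$$\sigma_2 = \sum_{j=1}^{N} f_j^2 \tag{2}$$

and $f_j$ is the scattering factor of the atom labelled $j$,
defines the normalized structure factor $E_{\mathbf{H}}$ as a function
of the reciprocal lattice vector $\mathbf{H}$. Clearly $E_{\mathbf{H}}$ is a
complex-valued function of $\mathbf{H}$. By means of

$$|E_{\mathbf{H}}| = \frac{1}{\sigma_2^{1/2}} \left| \sum_{j=1}^{N} f_j \exp(2\pi i\, \mathbf{H} \cdot \mathbf{r}_j) \right| \tag{3}$$

one obtains a real-valued (in fact non-negative-valued)
function of $\mathbf{H}$.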

For  each  crystal  structure  (3)  defines  a  function 
of  the  reciprocal  lattice  vector  H  so  that  there  exists 
in  fact  an  infinity  of  such  functions.  Thus  (3) 
presents  us  with  a  class  of  functions  of  infinite  and 
bewildering  variety.  How  are  we  to  make  any  sense 
of,  or  bring  any  order  into,  this  hopelessly  complex 
family  of  functions?  In  fact,  can  any  general 
statement  whatsoever  be  made  about  this  large 
and  varied  class  of  functions? 

In  order  to  answer  these  questions  we  associate 
with  each  equation  (3)  another  function  which 
defines  the  distribution  of  values  of  the  function 
$|E_{\mathbf{H}}|$. To clarify the notion of distribution of values,
we ask, for example, what fraction of the values of
$|E_{\mathbf{H}}|$ lies in the interval 0.0 to 0.1, or in the interval
0.1 to 0.2, or, more generally, in any interval $(a, b)$
where $0 \le a < b$? The answer is given by the remarkably
simple expression


$$\exp(-a^2) - \exp(-b^2), \tag{4}$$

or, as the mathematicians prefer, by

$$\int_a^b P(R)\, dR, \tag{5}$$

where

$$P(R) = \begin{cases} 2R \exp(-R^2) & \text{if } R \ge 0, \\ 0 & \text{if } R < 0. \end{cases} \tag{6}$$

The graph of $P(R)$ is shown in Fig. 1. Thus, the
non-negative-valued function, $P(R)$, of the real variable $R$
defines the distribution of values of the function
$|E_{\mathbf{H}}|$ [equation (3)].

In this way we have arrived at the notion of a
random variable (here the magnitude of a structure
factor, $|E_{\mathbf{H}}|$) and the probability distribution of a
random variable [$P(R)$ in this case]. Now it is a
remarkable fact that, under rather mild restrictions,
$P(R)$ is independent of the crystal structure [although
the function $|E_{\mathbf{H}}|$, equation (3), clearly is not]. We
say also that the probability that the random variable
$|E_{\mathbf{H}}|$ lies in the interval $(a, b)$ is given by (4) [or (5)].

Fig. 1 clearly shows that the values of $|E_{\mathbf{H}}|$ tend
to cluster around 0.7, weak or absent reflections are
very rare, and values of $|E_{\mathbf{H}}|$ in excess, say, of 3 are
also rare. This qualitative statement can be made
more precise, as shown next.
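
The qualitative reading of Fig. 1 can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, and the midpoint-rule quadrature is an assumed stand-in for the integral (5), used here to cross-check the closed form (4) for the distribution (6).

```python
import math

def wilson_acentric_pdf(r):
    """P(R) of equation (6): 2 R exp(-R^2) for R >= 0, and 0 otherwise."""
    return 2.0 * r * math.exp(-r * r) if r >= 0.0 else 0.0

def fraction_in_interval(a, b):
    """Closed form (4): fraction of |E_H| values in (a, b), with 0 <= a < b."""
    return math.exp(-a * a) - math.exp(-b * b)

def fraction_by_quadrature(a, b, steps=20000):
    """Midpoint-rule evaluation of integral (5); a cross-check only."""
    h = (b - a) / steps
    return h * sum(wilson_acentric_pdf(a + (k + 0.5) * h) for k in range(steps))

# Position of the maximum of P(R): dP/dR = 0 gives R = 1/sqrt(2).
mode = 1.0 / math.sqrt(2.0)
# Fraction of very weak reflections, |E_H| < 0.1.
weak = fraction_in_interval(0.0, 0.1)
# Fraction of very strong reflections, |E_H| > 3 (the limit b -> infinity of (4)).
strong = math.exp(-9.0)
```

Under these assumptions the maximum of P(R) lies near 0.7, and both the very weak and the very strong fractions are small, in line with the description of Fig. 1.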

Fig. 1. The probability distribution $P(R)$, equation (6), of the
magnitude of a structure factor $|E_{\mathbf{H}}|$, in $P1$.

With the function $P(R)$ in our possession it is a
relatively simple matter, at least in principle, to answer
certain questions concerned with the distribution of
values of the function $|E_{\mathbf{H}}|$. For example, the average
value of $|E_{\mathbf{H}}|^2$ and the variance of $|E_{\mathbf{H}}|^2$ are easily
found (at least in this case) by means of the formulas

$$\langle |E_{\mathbf{H}}|^2 \rangle = \int_0^\infty R^2 P(R)\, dR = 2 \int_0^\infty R^3 \exp(-R^2)\, dR = 1, \tag{7}$$

$$\operatorname{Var}(|E_{\mathbf{H}}|^2) = \left\langle \left( |E_{\mathbf{H}}|^2 - \langle |E_{\mathbf{H}}|^2 \rangle \right)^2 \right\rangle = \int_0^\infty (R^2 - 1)^2 P(R)\, dR = 1. \tag{8}$$

If the space group is $P\bar{1}$ then $P(R)$ is Gaussian
(Fig. 2):

$$P(R) = \begin{cases} \sqrt{2/\pi}\, \exp(-\tfrac{1}{2} R^2) & \text{if } R \ge 0, \\ 0 & \text{if } R < 0, \end{cases} \tag{9}$$

which should be compared with (6). Fig. 2 shows, in
sharp contrast to Fig. 1 for space group $P1$, that weak
or absent reflections are now relatively numerous and
that very strong reflections, while rare, occur with
greater frequency in $P\bar{1}$ than in $P1$. Although, as it
turns out, the average value of $|E_{\mathbf{H}}|^2$ is unity for both
space groups, even a superficial inspection of Figs. 1
and 2 shows that the dispersion of values of $|E_{\mathbf{H}}|$
about its mean is greater for $P\bar{1}$ than for $P1$. The
quantitative result is given next.

Fig. 2. The probability distribution $P(R)$, equation (9), of the
magnitude of a structure factor $|E_{\mathbf{H}}|$, in $P\bar{1}$.

The further comparison of (6) and (9) suggests in
fact a method for distinguishing between $P1$ and $P\bar{1}$
by a purely statistical analysis of the distribution of
values of the magnitudes of the observed structure
factors. For example, the mean and variance are now
given by

$$\langle |E_{\mathbf{H}}|^2 \rangle = \sqrt{\frac{2}{\pi}} \int_0^\infty R^2 \exp(-\tfrac{1}{2} R^2)\, dR = 1, \tag{10}$$

$$\operatorname{Var}(|E_{\mathbf{H}}|^2) = \sqrt{\frac{2}{\pi}} \int_0^\infty (R^2 - 1)^2 \exp(-\tfrac{1}{2} R^2)\, dR = 2, \tag{11}$$

so that the comparison of (8) and (11) serves to
distinguish the two space groups. As anticipated, the
variance, a measure of dispersion about the mean,
is in fact greater for $P\bar{1}$ than for $P1$.
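
The moments quoted in (7), (8), (10) and (11) can be reproduced by direct numerical integration. The following is a minimal sketch under stated assumptions: the function names, the midpoint rule and the cutoff of the upper limit at R = 12 are illustrative choices, not part of the original treatment. The last function shows how the same statistic might be formed from a list of observed normalized magnitudes.

```python
import math

def acentric_pdf(r):
    """Equation (6): distribution of |E_H| without a centre of symmetry."""
    return 2.0 * r * math.exp(-r * r)

def centric_pdf(r):
    """Equation (9): distribution of |E_H| with a centre of symmetry."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * r * r)

def moment(pdf, g, upper=12.0, steps=100000):
    """Midpoint-rule integral of g(R) P(R) over 0..upper.

    The tail beyond `upper` is neglected (an assumption; it is far below
    the tolerance of this sketch for both distributions).
    """
    h = upper / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * h
        total += g(r) * pdf(r)
    return total * h

def mean_and_variance_of_e2(pdf):
    """Mean and variance of |E_H|^2 under the distribution `pdf`."""
    mean = moment(pdf, lambda r: r * r)
    var = moment(pdf, lambda r: (r * r - mean) ** 2)
    return mean, var

def sample_variance_of_e2(magnitudes):
    """Variance of |E|^2 for a list of observed normalized magnitudes."""
    squares = [m * m for m in magnitudes]
    mu = sum(squares) / len(squares)
    return sum((s - mu) ** 2 for s in squares) / len(squares)
```

A sample variance of the squared magnitudes close to 1 would then point toward the distribution (6), and a value close to 2 toward (9), provided the assumptions behind those distributions hold for the crystal in question.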

By  means  of  a  more  detailed  study  of  the  effects  of 
the  various  symmetry  elements  on  these  distributions, 
together  with  the  observation  of  systematic  absences, 
one  arrives  at  a  method  for  determining  the  space 
group  in  which  the  statistical  analysis  of  the  observed 
intensities  plays  the  major  role. 

It should be noted that, although the probability
distributions $P(R)$ [(6) and (9)] describe the distribution
of values of the magnitude of a structure factor, a
simple change of variable enables one to replace these
distributions by new ones describing the distribution
of values of the intensity $I$ of a reflection.
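
As a worked illustration of this change of variable, suppose (an assumption made here for simplicity) that the intensity is normalized so that $I = R^2 = |E_{\mathbf{H}}|^2$. Then $dR = dI / (2\sqrt{I})$, and writing $P_I(I)\, dI = P(R)\, dR$ (the symbol $P_I$ is introduced only for this sketch) gives, for $I > 0$,

$$P_I(I) = \frac{P(\sqrt{I})}{2\sqrt{I}} = \exp(-I) \quad \text{from (6)},$$

$$P_I(I) = \frac{P(\sqrt{I})}{2\sqrt{I}} = \frac{1}{\sqrt{2\pi I}} \exp(-\tfrac{1}{2} I) \quad \text{from (9)}.$$

Both forms have mean 1, and their variances are 1 and 2 respectively, in agreement with (7), (8), (10) and (11).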

2. Estimating the values of weak or unobserved reflections

Expressions for the probability $p_1(I_0 \mid I)$ that a
reflection of true intensity $I$ will have an observed value
$I_0$ (possibly negative) have been found for different
counting modes, specifically for fixed-time counting
(i.e. the number of counts per specified length of time
interval), for fixed-count timing (i.e. the counting rate
for specified number of counts) and for variations of
these (Wilson, 1980). These distributions serve as the
basis for a proper statistical method for estimating
the values and standard deviations of the intensities
of measured reflections, particularly in the important
case that the intensities are weak or measured to be
negative.

The  integrated  intensity  of  a  reflection  is  obtained 
as  the  difference  between  a  counting  rate  averaged 
over  a  region  of  reciprocal  space  which  includes  the 
reflected  intensity  and  that  averaged  over  a  nearby 
region  which  excludes  the  reflected  intensity.  Owing 
to  statistical  fluctuations  in  the  counting  rates,  a 
measured  intensity  may  be  negative,  although  the 
true  intensity  must  of  course  be  non-negative.  Can 
one  obtain  an  improved  estimate  of  the  true  intensity 
and  its  standard  deviation  by  taking  into  account  its 
known  a  priori  probability  distribution? 

The  Bayesian  approach  to  this  problem  (French  & 
Wilson,  1978)  interprets  probability  distributions  as 
degrees  of  belief  in  the  possible  values  of  the  intensity 
rather  than  the  distribution  of  values  of  the  intensity. 
Prior  to  measuring  an  intensity  we  have  a  certain 
distribution  of  belief  in  its  possible  values  and  this 
distribution  is  changed  in  a  known  way  after  the 
measurement  is  made.  Thus,  by  calculating  the  a 
posteriori  expectation  value  and  variance  one  obtains 
an  improved  estimate  of  the  true  intensity  and  its 
variance from the measured values, which is of particular
importance in the case that the measured
intensity  is  weak  or  negative. 

More specifically, if one denotes by $p(I)$ the probability
distribution of the intensity of a reflection
[derived e.g. from (6) or (9)], by $p_1(I_0 \mid I)$ the (a priori)
conditional probability distribution of the observed
intensity $I_0$ of a reflection, given that its true value
is $I$, and by $p_2(I \mid I_0)$ the (a posteriori) conditional
probability distribution of the intensity $I$ of a reflection,
given that the observation $I_0$ has been made,
then Bayes’ theorem (see, e.g., pages 108-114 of
Feller, 1960) states that

$$p_2(I \mid I_0) = K\, p_1(I_0 \mid I)\, p(I), \tag{12}$$

where $K$ is a suitable scaling parameter. From (12) one
may calculate the a posteriori expected value and
variance of the intensity of a reflection after the
observation $I_0$ (possibly negative) has been made.
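
To make the use of (12) concrete, the sketch below evaluates the a posteriori mean and standard deviation numerically. Everything specific in it is an assumption made for illustration and is not taken from the text: the likelihood is taken to be Gaussian with a known standard deviation `sigma`, the prior is taken to be exponential in the intensity with mean `mean_intensity`, and the function name and integration limits are hypothetical. The counting-mode distributions of Wilson (1980) would replace the Gaussian in a faithful treatment.

```python
import math

def posterior_mean_and_sd(i_obs, sigma, mean_intensity, steps=20000, span=10.0):
    """Posterior mean and standard deviation of the true intensity I >= 0.

    Assumed for illustration (not taken from the text):
      likelihood p1(I0 | I): Gaussian, mean I, standard deviation sigma;
      prior p(I): proportional to exp(-I / mean_intensity) for I >= 0.
    """
    # Beyond this limit the unnormalized posterior is negligible.
    upper = max(i_obs, 0.0) + span * sigma
    h = upper / steps
    w0 = w1 = w2 = 0.0
    for k in range(steps):
        i_true = (k + 0.5) * h
        likelihood = math.exp(-0.5 * ((i_obs - i_true) / sigma) ** 2)
        prior = math.exp(-i_true / mean_intensity)
        w = likelihood * prior          # unnormalized right-hand side of (12)
        w0 += w
        w1 += w * i_true
        w2 += w * i_true * i_true
    mean = w1 / w0                      # dividing by w0 plays the role of K
    variance = w2 / w0 - mean * mean
    return mean, math.sqrt(max(variance, 0.0))

# A negative observation still yields a positive estimate of the true intensity.
weak_mean, weak_sd = posterior_mean_and_sd(-1.0, 1.0, 10.0)
```

For a strong reflection the prior has little influence and the estimate stays close to the observed value; for a weak or negative observation the restriction to non-negative intensities dominates and the estimate is pulled to a small positive value.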

It  is  important  to  derive  the  best  estimates  possible 
for observed intensities and their reliability, particularly
when these are weak or unobserved, for several
reasons.  First,  in  this  way  one  reduces  or  eliminates 
the  bias  in  the  resulting  structural  parameters.  Next, 
improved  estimates  of  the  structure  factor  moduli  and 
their  associated  errors  are  also  needed,  for  example, 
in  the  calculation  of  Fourier  series  or  difference 
syntheses.  Again,  if  a  large  number  of  reflections  are 
unobserved  because  they  are  weak,  the  procedure  may 
serve  an  important  function  in  the  determination  of 
the  crystal  structure  by  direct  methods  which  usually 
require  a  great  over-redundancy  of  data  for  success 
and  in  which  the  weak  intensities  play  an  important 
role.  Finally,  in  view  of  recent  developments  in 
integrating  the  techniques  of  direct  methods  with 
isomorphous  replacement  and  anomalous  dispersion, 
it  is  likely  that  the  reflections  of  weak  or  moderate 
intensity  will  play  an  increasingly  important  role  in 
the  solution  of  the  phase  problem  and  in  structure 
determination,  particularly  in  the  macromolecular 
case  (Hauptman,  1981a,  b). 


3.  Structural  isomorphism 


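For an isomorphous pair of structures normalized
structure factors $E_{\mathbf{H}}$ and $G_{\mathbf{H}}$ are defined by

$$E_{\mathbf{H}} = \left( \sum_{j=1}^{N} f_j^2 \right)^{-1/2} \sum_{j=1}^{N} f_j \exp(2\pi i\, \mathbf{H} \cdot \mathbf{r}_j), \tag{13}$$

$$G_{\mathbf{H}} = \left( \sum_{j=1}^{N} g_j^2 \right)^{-1/2} \sum_{j=1}^{N} g_j \exp(2\pi i\, \mathbf{H} \cdot \mathbf{r}_j). \tag{14}$$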

We seek the joint probability distribution of the pair
of magnitudes $|E_{\mathbf{H}}|$, $|G_{\mathbf{H}}|$, i.e. the function $P(R, S)$
which defines the distribution of values of the ordered
pair of non-negative real numbers $(|E_{\mathbf{H}}|, |G_{\mathbf{H}}|)$, in
much the same way that $P(R)$ defines the distribution
of values of $|E_{\mathbf{H}}|$ alone. Thus the fraction of pairs
$(|E_{\mathbf{H}}|, |G_{\mathbf{H}}|)$ for which $|E_{\mathbf{H}}|$ lies in the interval $(a, b)$
and $|G_{\mathbf{H}}|$ lies in the interval $(c, d)$ is given by the
double integral

$$\int_{R=a}^{b} \int_{S=c}^{d} P(R, S)\, dR\, dS, \tag{15}$$


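where, as it turns out, the function $P(R, S)$ is defined by

$$P(R, S) = \begin{cases} \dfrac{4RS}{1 - \alpha^2} \exp\left\{ -\dfrac{R^2 + S^2}{1 - \alpha^2} \right\} I_0\!\left( \dfrac{2\alpha R S}{1 - \alpha^2} \right) & \text{if } R \ge 0 \text{ and } S \ge 0, \\ 0 & \text{if } R < 0 \text{ or } S < 0, \end{cases} \tag{16}$$

$I_0$ is the Modified Bessel Function (Fig. 3), and

$$\alpha^2 = \frac{\left( \sum_{j=1}^{N} f_j g_j \right)^2}{\left( \sum_{j=1}^{N} f_j^2 \right) \left( \sum_{j=1}^{N} g_j^2 \right)}. \tag{17}$$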

$P(R, S)$ is said to be the joint probability distribution
of the pair of random variables $(|E_{\mathbf{H}}|, |G_{\mathbf{H}}|)$.
The graph of the Bessel function $I_0(x)$ shows that,
rather like $\cosh x$, this function grows at approximately
an exponential rate as $x$ tends toward $\pm\infty$.
Nevertheless, owing to the presence of the descending
exponential factor in (16), $P(R, S)$, which is equal to
zero when $R = 0$ or when $S = 0$, again tends toward
zero with increasing $R$ or $S$, but is of course positive
for intermediate values of $R$ and $S$.

Fig. 3. The modified Bessel function, $I_0(x)$.

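Equation (16) enables one to calculate the correlation
coefficient $r$ of the pair $(|E_{\mathbf{H}}|^2, |G_{\mathbf{H}}|^2)$.
(See Watson, 1958, for the needed integral formulas.)

$$r = \frac{\operatorname{Cov}(|E_{\mathbf{H}}|^2, |G_{\mathbf{H}}|^2)}{\{\operatorname{Var}(|E_{\mathbf{H}}|^2)\}^{1/2}\, \{\operatorname{Var}(|G_{\mathbf{H}}|^2)\}^{1/2}} \tag{18}$$

$$= \frac{\left\langle \left( |E_{\mathbf{H}}|^2 - \langle |E_{\mathbf{H}}|^2 \rangle \right) \left( |G_{\mathbf{H}}|^2 - \langle |G_{\mathbf{H}}|^2 \rangle \right) \right\rangle}{\left\langle \left( |E_{\mathbf{H}}|^2 - \langle |E_{\mathbf{H}}|^2 \rangle \right)^2 \right\rangle^{1/2} \left\langle \left( |G_{\mathbf{H}}|^2 - \langle |G_{\mathbf{H}}|^2 \rangle \right)^2 \right\rangle^{1/2}}, \tag{19}$$

$$\langle |E_{\mathbf{H}}|^2 \rangle = \langle |G_{\mathbf{H}}|^2 \rangle = \int_0^\infty \!\! \int_0^\infty R^2 P(R, S)\, dR\, dS = \int_0^\infty \!\! \int_0^\infty S^2 P(R, S)\, dR\, dS = 1, \tag{20}$$

$$\operatorname{Cov}(|E_{\mathbf{H}}|^2, |G_{\mathbf{H}}|^2) = \int_0^\infty \!\! \int_0^\infty (R^2 - 1)(S^2 - 1) P(R, S)\, dR\, dS = \alpha^2, \tag{21}$$

$$\operatorname{Var}(|E_{\mathbf{H}}|^2) = \operatorname{Var}(|G_{\mathbf{H}}|^2) = \int_0^\infty \!\! \int_0^\infty (R^2 - 1)^2 P(R, S)\, dR\, dS = 1 \tag{22}$$

and, finally, from (18),

$$r = \alpha^2. \tag{23}$$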

Inspection of (17) and (23) shows that in the extreme
case that $f_j = g_j$ for every $j$ then $|E_{\mathbf{H}}|^2 = |G_{\mathbf{H}}|^2$ for
every $\mathbf{H}$, and the correlation coefficient of the pair
$(|E_{\mathbf{H}}|^2, |G_{\mathbf{H}}|^2)$ is unity, as expected. In general, however,
$r$ is positive and less than unity. Clearly, in the
case of perfect isomorphism, $r$ is a positive constant as
a function of $\sin\theta/\lambda$. In the case of imperfect isomorphism,
on the other hand, $r$ is a monotonically decreasing
function of $\sin\theta/\lambda$. Thus $P(R, S)$ leads to a method
for determining the degree of isomorphism between
two structures. In fact $r$, as a function of $\sin\theta/\lambda$, may
be taken as a measure of the degree of isomorphism
of the two structures, i.e. the degree to which the two
structures coincide. Application to protein crystallography,
for which the isomorphous replacement
technique is the most important tool, is clear.
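
Equations (17) and (23) reduce the correlation coefficient to a simple expression in the scattering factors. A minimal sketch follows; the function name and the numerical scattering factors are hypothetical and chosen only to make the arithmetic easy to follow (one hundred light atoms common to both structures and a single heavy atom present only in the second).

```python
def isomorphism_correlation(f, g):
    """r = alpha^2 of equations (17) and (23) for paired scattering factors.

    f[j] and g[j] are the scattering factors of atom j in the two structures
    (use 0.0 for an atom absent from one of them).
    """
    if len(f) != len(g):
        raise ValueError("f and g must list the same atoms in the same order")
    cross = sum(fj * gj for fj, gj in zip(f, g))
    norm_f = sum(fj * fj for fj in f)
    norm_g = sum(gj * gj for gj in g)
    return cross * cross / (norm_f * norm_g)

# Hypothetical values: 100 light atoms shared by both structures, plus one
# heavy atom that appears only in the second (derivative) structure.
native = [6.0] * 100 + [0.0]
derivative = [6.0] * 100 + [80.0]
r_identical = isomorphism_correlation(native[:100], native[:100])
r_derivative = isomorphism_correlation(native, derivative)
```

With these illustrative numbers the sums are 3600 for the cross term and the first structure and 10000 for the second, so the identical pair gives a correlation of 1 and the heavy-atom pair gives 0.36.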

Finally, the case that the G-structure is a trial, or
model structure, possibly incomplete, is also included.
In this case a more detailed analysis leads to the joint
probability distribution $P(R, S)$ dependent on the
parameter $\langle |\Delta \mathbf{r}| \rangle$, the average error of the trial
structure. Then the average error is expressible in
terms of the correlation coefficient $r$, although the
normalized discrepancy index

$$\frac{\left\langle \big|\, |E_{\mathbf{H}}| - |G_{\mathbf{H}}| \,\big| \right\rangle_{\mathbf{H}}}{\left\langle |E_{\mathbf{H}}| \right\rangle_{\mathbf{H}}} \tag{24}$$

is more commonly used. Clearly then this distribution
plays an important role in the process of refinement of
crystal structures. (See Srinivasan & Parthasarathy,
1976, for further details.)
4. Weighting

4.1.  Least-squares  refinement 

The  least-squares  refinement  process  of  crystal 
structures consists simply of the solution by least
squares of the highly redundant structure factor
equations.  In  order  to  minimize  systematic  error  in 
the  determination  of  the  structural  parameters  it  is 
essential  to  weight  these  equations  correctly  and,  to 
this  end,  estimates  of  the  uncertainties  in  the  observed 
intensities,  along  the  lines  briefly  described  earlier, 
are  necessary. 

4.2.  Fourier  syntheses 

The electron density function $\rho$ is represented by a
Fourier series the coefficients of which are the structure
factors

$$F = |F| \exp(i\phi). \tag{25}$$

In practice the magnitudes $|F|$ are obtained from
experiment and are therefore subject to errors, the
magnitudes of which may be estimated as described
earlier, e.g. by taking into account fluctuations due
to counting statistics, instrument instability, and inadequate
correction factors and then employing the
Bayesian approach (via the a posteriori probability
distribution) to estimate the standard deviation. The
phases $\phi$ are derived from the observed magnitudes
$|F|$ and are therefore subject to additional errors, the
distribution of which depends on the nature of the
phase determination process. Thus the structure factor
$F$ is a random variable, and its probability distribution
determines how the Fourier coefficients are to be
weighted in order to yield, in some sense, the ‘best’
electron density function $\rho$. Similar remarks apply,
for  example,  to  the  use  of  the  Patterson  function  to 
locate  heavy  atom  positions  employing  data  from  a 
pair  of  isomorphous  crystals. 
No matter how the phases are determined they are
subject to error. In the heavy atom method, for
example, phases calculated from the heavy atom
positions are clearly only an approximation to the
true phases. Phases obtained by single or multiple
isomorphous replacement are subject to error because
of experimental error in the observed intensities as
well as imperfect isomorphism. Phases determined
by the method of anomalous dispersion or direct
methods are subject to similar errors.

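Since

$$F_{\text{true}} \ne F_{\text{obs}}, \tag{26}$$

it follows that

$$\rho_{\text{true}} \ne \rho_{\text{obs}}. \tag{27}$$

Define

$$\Delta\rho = \rho_{\text{true}} - \rho_{\text{obs}} \tag{28}$$

and define the best Fourier as the one that minimizes

$$\int_V (\Delta\rho)^2\, dV. \tag{29}$$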

This  definition  of  the  best  Fourier  then  serves  as  the 
basis  for  the  derivation  of  a  suitable  weighting  function 
W.  (See,  e.g.  Srinivasan  &  Parthasarathy,  1976,  for 
further  details.) 


4.3.  Tangent  formula 


The most widely used formula for expanding and
refining a basis set of phases is the tangent formula:

$$\tan \phi_{\mathbf{H}} = \frac{\left\langle |E_{\mathbf{K}} E_{\mathbf{H}-\mathbf{K}}| \sin(\phi_{\mathbf{K}} + \phi_{\mathbf{H}-\mathbf{K}}) \right\rangle_{\mathbf{K}}}{\left\langle |E_{\mathbf{K}} E_{\mathbf{H}-\mathbf{K}}| \cos(\phi_{\mathbf{K}} + \phi_{\mathbf{H}-\mathbf{K}}) \right\rangle_{\mathbf{K}}} = \frac{T}{B}, \tag{30}$$

where $\sin \phi_{\mathbf{H}}$ has the same sign as $T$ and $\cos \phi_{\mathbf{H}}$ has
the same sign as $B$. Although it may not be the
best technique for this purpose, it does give good
results, in general, and is most efficient. The stability
of the formula is improved if the averages are weighted
by means of an increasing function of $T^2 + B^2$, the
rationale for this procedure depending on a rather
detailed study of the derivation, via probabilistic
techniques, of the tangent formula. The formula, with
suitable weights, has assumed greater importance in
recent months with the unexpected discovery by Yao
Jia-Xing (1981) that a randomly chosen basis set of
phases will surprisingly often converge to the correct
answer, so that the tangent formula alone, properly
weighted, becomes an important tool for structure
determination.
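
A single application of the tangent formula (30) can be sketched as follows. This is an illustration under stated assumptions rather than a description of any published program: the function name, the tuple layout of `terms` and the optional per-term `weights` argument are hypothetical, and the text does not specify the weighting function beyond its being an increasing function of T² + B². The two-argument arctangent is used so that the sine of the result takes the sign of T and the cosine the sign of B, as the formula requires.

```python
import math

def tangent_formula(terms, weights=None):
    """Return (phi_H, T, B) from equation (30).

    terms   : list of (abs_E_K, abs_E_HK, phi_K, phi_HK), one entry per
              contributing K, with phases in radians.
    weights : optional list of per-term weights (hypothetical; the text
              states only that weights should increase with T^2 + B^2).
    """
    if not terms:
        raise ValueError("at least one contributing term is required")
    if weights is None:
        weights = [1.0] * len(terms)
    t_sum = 0.0
    b_sum = 0.0
    for w, (e_k, e_hk, phi_k, phi_hk) in zip(weights, terms):
        amplitude = w * abs(e_k * e_hk)
        t_sum += amplitude * math.sin(phi_k + phi_hk)
        b_sum += amplitude * math.cos(phi_k + phi_hk)
    n = len(terms)
    t_avg = t_sum / n   # numerator T of (30), an average over K
    b_avg = b_sum / n   # denominator B of (30)
    # atan2 resolves the quadrant: sin(phi_H) follows T, cos(phi_H) follows B.
    return math.atan2(t_avg, b_avg), t_avg, b_avg

# Two hypothetical contributors whose phase sums agree (both equal 2.0 rad).
phi_h, t_value, b_value = tangent_formula(
    [(1.5, 2.0, 0.5, 1.5), (1.0, 1.2, 3.0, -1.0)]
)
```

In an iterative refinement the returned phase would replace the current estimate for H, and the quantity T² + B² could feed whatever weighting scheme is adopted for later cycles.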

This  research  was  supported  by  National  Science 
Abstract

We  describe  the  basic  principles  of  the  Bayesian 
approach  to  statistical  analysis.  We  show  how  it  leads 
to  sensible  estimates  of  structure  factor  moduli  from 
intensity  observations,  whether  the  latter  are  positive 
or  negative.  For  diffractometer  data  collected  using 
a  step-scan  method,  we  develop  a  profile-fitting 
approach  to  primary  data  reduction  based  upon  the 
Bayesian  three-stage  regression  model.  Finally,  we 
indicate  how  a  Bayesian  approach  to  model  choice 
may  lead  to  a  satisfactory  alternative  to  Hamilton’s 
R-test as a means of choosing between differing
molecular  structures  that  result  from  refinements  to 
the  same  data  but  under  different  sets  of  soft 
constraints. 


Introduction 

It  is  a  gross  simplification  to  describe  present  day 
statistical  thinking  as  divided  between  two  schools: 
the  Bayesian  and  the  frequentist.  Nonetheless,  it  is  a 
simplification  that  we  shall  make;  because  by  doing 
so  we  may  introduce  and  emphasise  the  distinctive 
flavour  of  the  Bayesian  approach.  The  difference 
between the two schools lies mainly in their interpretations
of the concept of probability; so we shall
begin  there. 

A  frequentist  holds  that  probability  has  meaning 
only  as  the  numerical  representation  of  variability 
actually  present  within  a  system.  Because  of  this  he 
cannot  give  a  technical  probabilistic  meaning  to 
questions  such  as  the  following. 

(i) What is the most probable value of a parameter?

(ii) Within what range of values does an unknown
parameter most probably lie?

(iii)  Is  it  improbable  that  a  particular  hypothesis  is 
true? 

The  first  pair  of  questions  is  meaningless  to  him 
because  the  parameters  of  a  model  do  not  vary:  they 
are  fixed  even  if  they  are  unknown.  The  third  question 
is  meaningless  because  the  truth  or  falsehood  of  a 
hypothesis  is  immutable.  None  of  these  questions 
refer  to  variability  actually  present  within  a  system. 
However,  while  they  may  be  meaningless  to  him  within 
his technical language, they are the everyday expression
of his motives in a statistical investigation. It is
to answer such questions that he has developed his
methods of estimation, confidence interval construction,
and hypothesis testing.

This  conflict  between  everyday  language  and  the 
technical  language  of  frequentist  statistics  may  seem 
an  esoteric  matter  for  the  philosopher;  however,  it 
does  have  importance  for  the  practical  scientist.  The 
need  to  provide  answers  to  questions  that  cannot  be 
framed  technically  has  resulted  in  frequentist  statistics 
being  based  upon  arguments  that  are  exceedingly 
subtle:  some  would  say,  contorted.  The  necessary 
‘double-think’  of  the  frequentist  approach  makes  it 
particularly difficult for the scientist to learn, understand,
and use the resulting methods of inference.
How  many  students  find  their  statistics  courses  easy? 
More importantly, real experiments are seldom quite
like  the  stereotypes  found  in  statistical  text-books. 
Thus  to  interpret  an  actual  data  set  it  is  usually 
necessary  to  modify  a  standard  technique  or,  perhaps, 
develop an entirely new one. The subtleties of argument
required to complete these tasks are often beyond
the scientist; and many data sets are poorly interpreted,
information is lost, and, very occasionally,
false  conclusions  drawn. 

The Bayesian approach does not lead to this conflict of language. On the
contrary, its technical language is a direct numerical representation of
the scientist’s everyday language. Here probability is taken to represent
the various degrees of belief or uncertainty that a scientist has in the
truth of propositions about the system under observation. For instance,
the more likely that a parameter has the value 1.54 (say) the higher is
the numerical probability of the proposition ‘this parameter has the value
1.54’. All uncertainty is modelled through probability and, in particular,
the three questions quoted above translate directly into the technical
language of Bayesian statistics. As a consequence Bayesian argument is
intuitive, easily learnt, and, most importantly, easily developed. There
is no difficulty in constructing methods appropriate to each particular
investigation.

Typically a Bayesian analysis proceeds as follows. The scientist first
develops a physical model for the system under study. Within this model
there will be many unknown parameters, perhaps even unknown functional
forms; but nothing will be completely unknown. There may be theoretical
reasons why a parameter must be positive. Previous investigations may
limit the possible range of a parameter. Exact functional forms may be
unknown, but they may be expected to possess properties of smoothness,
symmetry or unimodality, etc. All such prior information may be modelled
probabilistically; and it is much of the purpose of later sections of this
paper to indicate how this may be done.


22  Simon  French  and  Stuart  Oatley 

The  next  step  is  the  design  and  execution  of  an 
experiment.  Here  the  scientist  must  ask  himself  what 
he  would  expect  to  observe  if  he  knew  the  unknown 
quantities  exactly.  He  considers  this  question  for  each 
possible  value  or  form  of  the  unknowns,  and  from 
his  answers  describes  the  structure  of  his  experiment 
probabilistically.  Again  we  shall  illustrate  precisely 
how  this  may  be  done,  later  in  the  paper. 

At  this  point  in  the  analysis  the  scientist  may  sit 
back  and  let  the  basic  laws  of  mathematics  take  over. 
For,  once  the  results  of  the  experiment  have  been 
observed,  all  that  need  be  done  is  to  apply  Bayes’ 
theorem  (hence  the  name  Bayesian  statistics)  in  order 
to  combine  the  prior  information  about  the  unknowns 
with  that  inherent  in  the  experimental  data.  The 
resulting  posterior  distribution  is  the  probabilistic 
representation  of  the  synthesis  of  all  the  information 
that  the  scientist  has;  and  it  provides  the  answers  to 
whatever  questions  might  concern  him.  According  to 
circumstances,  the  mean,  mode,  or  median  of  the 
posterior  distribution  may  serve  as  a  suitable  estimate 
of the parameters. It is straightforward to calculate
the most probable range for a parameter; and hypothesis
testing simply becomes a matter of comparing
relative posterior probabilities.

Although we shall avoid as much mathematical notation as possible, instead
concentrating on illustrating and, we hope, illuminating Bayesian methods,
some will be necessary; and expressing the above paragraphs symbolically
will serve as an excellent introduction. The scientist begins with a
physical model, in which there are some unknown quantities, which we shall
denote by $\theta$. The prior information about $\theta$ is represented by
the prior distribution, which we assume to have the probability density
function $p_\theta(\cdot)$. Throughout this paper we shall use subscripts
to indicate quantities about which beliefs are being expressed; this
enables us to use a lower case $p$ to denote all probability density
functions.

The experiment gives rise to an observation $X$, which the scientist
expects to occur with probability density $p_X(\cdot \mid \theta)$. Note
that this distribution is conditional on the unknowns $\theta$. It
describes the variation that the scientist would predict in his
observations on the basis of his physical model if he knew the exact
values of the unknowns $\theta$.

If the observations $X = x$ are actually made, then Bayes’ theorem gives
the scientist’s belief in $\theta$ updated by the experimental data as:

$$p_\theta(\theta \mid x) \underset{\theta}{\propto} p_X(x \mid \theta) \cdot p_\theta(\theta). \tag{1}$$

The symbol $\underset{\theta}{\propto}$ means ‘is proportional to as a
function of $\theta$’; $x$ being the fixed, observed value. The constant
of proportionality is determined by the condition that a probability
density must integrate to unity. $p_X(x \mid \theta)$, when thought of as
a function of $\theta$ with $X$ fixed at the observed value $x$, is known
as the likelihood function. Thus (1) may be remembered by

posterior ∝ likelihood × prior.
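
As a minimal numerical sketch of this rule, assuming a single unknown
evaluated on a grid, the posterior can be formed by multiplying likelihood
and prior pointwise and then fixing the constant of proportionality so that
the result integrates to unity. The function names, the flat prior on
non-negative values, and all numbers below are illustrative assumptions and
are not taken from the paper.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def grid_posterior(thetas, prior, likelihood):
    """Evaluate 'posterior proportional to likelihood x prior' on an even grid.

    The result is normalised so that its Riemann sum over the grid is 1.
    """
    step = thetas[1] - thetas[0]
    unnormalised = [likelihood(t) * prior(t) for t in thetas]
    constant = sum(unnormalised) * step
    return [u / constant for u in unnormalised]


# Hypothetical example: theta is known only to be non-negative, and one
# observation x = 1.0 is made with known standard deviation 2.0.
STEP = 0.01
thetas = [k * STEP for k in range(-500, 1501)]   # grid from -5 to 15


def prior(theta):
    # Flat on theta >= 0; any constant factor cancels on normalisation.
    return 1.0 if theta >= 0.0 else 0.0


def likelihood(theta):
    # Density of the fixed observation, viewed as a function of theta.
    return normal_pdf(1.0, theta, 2.0)


posterior = grid_posterior(thetas, prior, likelihood)
posterior_mean = sum(t * p for t, p in zip(thetas, posterior)) * STEP
print(f"posterior mean = {posterior_mean:.3f}")
```

In this sketch the posterior is zero wherever the prior is zero, and the
posterior mean lies above the observation because the prior excludes
negative values.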


The structure of Bayesian analysis is summarised in Fig. 1.

Fig. 1. The structure of Bayesian analysis. Initial knowledge of the
parameters (the prior distribution) and the observation $X = x$ (which
gives the likelihood $p_X(x \mid \theta)$) are combined through Bayes’
theorem, $p_X(x \mid \theta) \cdot p_\theta(\theta)$, to give the synthesis
of prior knowledge and data (the posterior distribution).

It would be foolish to pretend that the Bayesian approach is without its
critics, but we shall not discuss their objections here. The references
cited below do that far more effectively than we could. However, we do
indicate and briefly discuss one important objection in § 4 of this paper.
Our purpose in the remaining sections is to introduce Bayesian analyses of
some crystallographic problems. We shall avoid technical details, which
are available elsewhere in the literature, and concentrate on providing an
overview.

Barnett (1973) provides a very readable account of the differences between
Bayesian and frequentist statistics. An excellent introduction to Bayesian
analysis may be found in the early chapters of Box & Tiao (1973); the
later chapters are important for their technical details. Other books
which develop the Bayesian approach are: Jeffreys (1961), Lindley (1965),
DeGroot (1970), and De Finetti (1974). Box (1980) is particularly
important for its setting of Bayesian statistics within the Scientific
Method. Within the crystallographic literature few applications of
Bayesian ideas have been reported; we know only of Mendes & De Polignac
(1973), French & Wilson (1978), French (1978), and Oatley & French (1981).
We might also refer to the excellent statement of the phase problem within
a Bayesian framework in the first few paragraphs of Hauptman & Karle
(1953).


1.  Negative  intensity  observations 

Reflections with small structure factor moduli have always led to
difficulties. Their true intensities are, of course, non-negative, but
their observed intensities may not be, because of counting statistics or
photographic recording errors. When a measured intensity is positive, its
square root forms a sensible estimate of the structure factor modulus.
However, what should be done when the observation is negative? Various
suggestions have been made: all negative observations should be omitted
from the data, i.e. treated as unobserved; they should be set to zero and
included in the data; or they should be set to some arbitrary, constant
fraction of the mean intensity in the data. Unfortunately none of these
procedures is entirely satisfactory. All can lead to biases in the final
structure. Moreover, they can lead to difficulty in interpreting Fourier,
particularly difference Fourier, syntheses in the early stages of the
structure determination. As shown in detail by French & Wilson (1978) and
as sketched briefly below, a Bayesian analysis leads to a natural and
straightforward solution to the problem.

At a given reflection the parameter that concerns us is the true
intensity. It is arguable, particularly in the light of the preceding
discussion, that the true structure factor modulus is the parameter of
interest; but, as will become apparent, this would lead to precisely the
same results. Thus the unknown parameter in our physical model is the true
intensity, which we shall denote by $J$. In our notation of § 1,
$\theta = J$. Our first task then is to consider our prior knowledge of
$J$ and so define the prior density $p_J(\cdot)$. The most obvious piece
of information that we have is that $J$ is non-negative; French & Wilson
(1978) discuss a density which embodies this and only this information.
However, as they point out, we invariably know rather more than just the
sign of $J$. Taken as a whole, any moderate or large data set obeys
Wilson’s (1949) statistics. So, for an acentric reflection

$$p_J(J) = \begin{cases} \Sigma^{-1} \exp(-J/\Sigma) & \text{if } J \ge 0, \\ 0 & \text{otherwise;} \end{cases} \tag{2}$$

and for a centric reflection

$$p_J(J) = \begin{cases} (2\pi\Sigma J)^{-1/2} \exp(-J/(2\Sigma)) & \text{if } J \ge 0, \\ 0 & \text{otherwise.} \end{cases} \tag{3}$$

For small molecules, where one may assume that the atoms are uniformly and
independently distributed about the unit cell, a simple theoretical
derivation shows that $\Sigma$ is the sum of squares of the atomic
scattering factors of all the atoms in the unit cell. For larger molecules
the presence of secondary structure makes the assumption that the atoms
are independently distributed untenable. In such cases French & Wilson
conjectured and showed empirically that $\Sigma$ may be taken to be the
mean intensity in the appropriate shell of reciprocal space. Wilson (1981)
has recently provided a theoretical justification of this conjecture.

It  would,  of  course,  be  possible  to  use  prior  densities 
specific  to  space  groups  of  higher  symmetry,  but  to 
our  knowledge  this  has  not  been  done. 

With the prior distribution now defined we turn to the experimental
observation. We shall denote the observed intensity by $I$. By ‘observed
intensity’ we mean the following. We assume that all relevant data sets,
collected either by diffractometer or photographic methods, have been
corrected for Lorentz, polarisation, absorption, extinction, and
radiation-damage effects, have been reduced to a common scale, and have
been merged over equivalents. $I$ is this ‘merged intensity’ containing
all the available observational information at the given unique
reflection. All the operations needed to produce this merged intensity are
assumed to have been carried out on the raw intensity measurements, be
they positive or negative.

$I$ forms our experimental observation; so in the notation of § 1 we have
$X = I$. Hence we must now consider the probability density
$p_I(\cdot \mid J)$ (i.e. $p_X(\cdot \mid \theta)$ in § 1). Throughout we
shall assume that $p_I(\cdot \mid J)$ is a normal density, viz.

$$I \sim N(J, \sigma^2). \tag{4}$$

Thus, aside from normality, we are also assuming that $I$ is an unbiased
observation on $J$ with known variance $\sigma^2$. These assumptions are
discussed by French & Wilson (1978).

The posterior distribution for $J$ is now given by Bayes’ theorem
(cf. (1)):

$$p_J(J \mid I) \underset{J}{\propto} p_I(I \mid J) \cdot p_J(J). \tag{5}$$

Note that $p_J(J \mid I) = 0$ for $J < 0$ because $p_J(J)$ occurs
multiplicatively in (5) and is zero for this range of $J$ (see (2) and
(3)). Thus our prior knowledge that $J$ must be non-negative is carried
through to the posterior distribution.

Most, if not all, crystallographic structure solutions do not use the
posterior density for the intensity in its entirety; but use
approximations based upon its mean and variance, or upon the mean and
variance of its square root, the structure factor modulus. Least squares
refinement is an extremely common example of a solution technique that
requires just these. So usually we do not need the full density
$p_J(J \mid I)$, but only its moments:

$$E_J(J \mid I) = \int J \cdot p_J(J \mid I) \, dJ, \tag{6}$$

$$\mathrm{Var}_J(J \mid I) = \int (J - E_J(J \mid I))^2 \cdot p_J(J \mid I) \, dJ, \tag{7}$$

or, letting $F = \sqrt{J}$,

$$E_J(F \mid I) = \int F \cdot p_J(J \mid I) \, dJ, \tag{8}$$

$$\mathrm{Var}_J(F \mid I) = \int (F - E_J(F \mid I))^2 \cdot p_J(J \mid I) \, dJ. \tag{9}$$

Earlier we stated that it did not matter whether we took the true
intensity or the true structure factor modulus as the parameter of
interest. Expressions (8) and (9) confirm that by taking $J$ as the
parameter we do not lose the ability to estimate $F$ and to give a guide
to the precision of this estimate. Indeed, had we taken $F$ as the
parameter, our analysis would have led to expressions completely
equivalent to (6) - (9). Our prior distribution $p_F(\cdot)$ would have
been derived from Wilson’s statistics for the structure factor modulus and
would thus have been (2) or (3) with the change of variable
$F = \sqrt{J}$ and the introduction of the appropriate Jacobian
$|dJ/dF|$. The Jacobian would have remained throughout the resulting
analysis, ensuring that it was identical to that above but with a simple
change of variable.

Expressions  (6)  -  (9)  may  look  horrendous,  but 
they  are  nonetheless  relatively  easy  to  evaluate  or,  at 
least,  approximate.  French  &  Wilson  (1978)  give 
details. 
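
As a rough illustration of how the posterior moments of the intensity and
of the structure factor modulus might be approximated numerically for an
acentric reflection, the sketch below integrates the product of a normal
likelihood and the acentric Wilson prior by the midpoint rule. It is not
the procedure of French & Wilson (1978); the function name, the integration
limits, and the numerical values are assumptions made for illustration.

```python
import math


def acentric_posterior_moments(i_obs, sigma, big_sigma, n=20000):
    """Posterior mean and variance of J and of F = sqrt(J), given I = i_obs.

    Assumes an acentric Wilson prior with mean intensity big_sigma and a
    normal likelihood with standard deviation sigma. Midpoint-rule
    integration over J >= 0.
    """
    # Completing the square: for J >= 0, likelihood x prior is proportional
    # to a normal curve in J centred here, with standard deviation sigma.
    centre = i_obs - sigma ** 2 / big_sigma
    peak = max(centre, 0.0)                 # where the integrand is largest
    upper = peak + 10.0 * sigma             # integrand is negligible beyond
    h = upper / n
    ref = -0.5 * ((peak - centre) / sigma) ** 2   # log-weight at the peak
    w_sum = j_sum = jj_sum = f_sum = 0.0
    for k in range(n):
        j = (k + 0.5) * h
        # Weight relative to the peak, which avoids numerical underflow.
        w = math.exp(-0.5 * ((j - centre) / sigma) ** 2 - ref)
        w_sum += w
        j_sum += w * j
        jj_sum += w * j * j
        f_sum += w * math.sqrt(j)
    mean_j = j_sum / w_sum
    var_j = jj_sum / w_sum - mean_j ** 2
    mean_f = f_sum / w_sum
    var_f = mean_j - mean_f ** 2            # because F^2 = J
    return mean_j, var_j, mean_f, var_f


# Hypothetical values: one negative and one strongly positive observation.
weak = acentric_posterior_moments(i_obs=-5.0, sigma=10.0, big_sigma=400.0)
strong = acentric_posterior_moments(i_obs=1000.0, sigma=10.0, big_sigma=400.0)
print("weak:   E(J|I) = %.2f, E(F|I) = %.2f" % (weak[0], weak[2]))
print("strong: E(J|I) = %.2f, E(F|I) = %.2f" % (strong[0], strong[2]))
```

The same routine handles both observations: the negative one yields a
small positive estimate, while the strong one is left almost unchanged.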

It  should  be  noted  that  this  Bayesian  analysis 
applies  to  all  reflections,  whatever  the  observed 
intensities.  There  are  no  ad-hoc  cut-off  points  with 
positive  observations  treated  one  way  and  negative 
another.  This  consistency  of  treatment  is  typical  of 
the  Bayesian  approach  and  adds  much  to  its  intuitive 
appeal. 

In Fig. 2 we summarise and illustrate our analysis. Lewis (1981) has
recently extended and developed these ideas to include anomalous
scattering information. He was interested not in estimating the
intensities themselves, but rather anomalous differences across Bijvoet
pairs. Using appropriate prior distributions drawn from Srinivasan &
Parthasarathy (1976) and taking the differences in measured intensities
across Bijvoet pairs as his observations, he has produced sensible
estimates of anomalous differences and from these successfully located the
heavy atoms in an isomorphous protein derivative. Without the Bayesian
method these atoms had proved difficult to locate.

2.  The  Bayesian  three-stage  model 

No physical model perfectly explains data, however free they are from
experimental error. There is always some discrepancy between the
mathematical behaviour of the model and the actual behaviour of the system
being modelled. Usually this modelling error is several orders of
magnitude smaller than the experimental error present, and thus can safely
be ignored. However, there are occasions when the modelling error is
sufficiently large to require special treatment in the statistical
analysis. In this section we explain how modelling error can be included
in a Bayesian analysis. To do this it will be necessary to return to (1)
and discuss Bayes’ theorem a little further.

Fig. 2. Illustration of the Bayesian analysis: (a) when the observation is
positive; (b) when the observation is negative.

Bayes’ theorem,

$$p_\theta(\theta \mid x) \underset{\theta}{\propto} p_X(x \mid \theta) \cdot p_\theta(\theta), \tag{10}$$

is really little more than the re-expression of the joint distribution of
$X$ and $\theta$. By the definition of conditional probability densities
(ignoring the niceties of measure, i.e. integration theory) we have:

$$p_\theta(\theta \mid x) = \frac{p_{X,\theta}(x, \theta)}{p_X(x)} \underset{\theta}{\propto} p_{X,\theta}(x, \theta), \tag{11}$$

since $p_X(x)$ is independent of $\theta$. $p_{X,\theta}(\cdot, \cdot)$
is, of course, the joint density of $X$ and $\theta$. Again by definition
we have

$$p_X(x \mid \theta) = \frac{p_{X,\theta}(x, \theta)}{p_\theta(\theta)} \;\Rightarrow\; p_{X,\theta}(x, \theta) = p_X(x \mid \theta) \cdot p_\theta(\theta). \tag{12}$$

So combining (11) and (12) we obtain

$$p_\theta(\theta \mid x) \underset{\theta}{\propto} p_{X,\theta}(x, \theta) = p_X(x \mid \theta) \cdot p_\theta(\theta). \tag{13}$$

Usually we take the proportionality of the first and last terms of (13) as
the basis of a Bayesian analysis because doing so structures the logical
development in the intuitive form:

posterior ∝ likelihood × prior.

However, here we shall find it convenient to use the proportionality of
the first and second terms in (13); that is

posterior ∝ joint density along X = x.

Until now we have been assuming that the unknown quantities in the
physical model have been sufficient to define completely the statistical
parameters in the probabilistic description of the experiment,
$p_X(\cdot \mid \theta)$. In other words, we have assumed that the only
unknowns upon which the distribution of $X$ depends are those of the
physical model. This is equivalent to assuming that there is no modelling
error. Hence, to allow for modelling error, we must make the distribution
of $X$ depend on parameters other than those in the physical model. We
shall need the following notation.

$\theta_2$: the parameters of the physical model; i.e. $\theta_2$
represents the unknown quantities which we have previously denoted by an
unsubscripted $\theta$.

$\theta_1$: the statistical parameters (mean, variance, etc.) of the
distribution of $X$.

With these we may define a structuring of Bayesian analysis known as the
three-stage model.

Stage III: Prior knowledge

The scientist’s prior knowledge of the parameters in his physical model is
represented by the prior density: $p_{\theta_2}(\cdot)$.

Stage II: Modelling error

The scientist’s beliefs about the adequacy of his physical model are
represented by the probability density:
$p_{\theta_1}(\cdot \mid \theta_2)$. In other words,
$p_{\theta_1}(\cdot \mid \theta_2)$ describes the scientist’s relative
beliefs in the different possible values of the statistical parameters of
$X$ given the particular values $\theta_2$ of the unknowns in his physical
model.

Stage I: Observation error

The observations $X$ have a distribution with parameters $\theta_1$, i.e.
the probability density of $X$ is $p_X(\cdot \mid \theta_1)$.

How these distributions may be defined in practical cases will be
illustrated in the next section.

Combining these three densities gives the joint density of $X$,
$\theta_1$, and $\theta_2$:

$$p_{X,\theta_1,\theta_2}(x, \theta_1, \theta_2) = p_X(x \mid \theta_1) \cdot p_{\theta_1}(\theta_1 \mid \theta_2) \cdot p_{\theta_2}(\theta_2). \tag{14}$$

After the observation $X = x$ has been made, the posterior joint density
of $\theta_1$ and $\theta_2$ conditional on $x$ is given by

$$p_{\theta_1,\theta_2}(\theta_1, \theta_2 \mid x) \underset{\theta_1,\theta_2}{\propto} p_{X,\theta_1,\theta_2}(x, \theta_1, \theta_2), \tag{15}$$

i.e. the posterior joint density of $\theta_1$ and $\theta_2$ is
proportional to the joint density of $X$, $\theta_1$, and $\theta_2$ along
$X = x$. Usually the scientist’s interest is centred on $\theta_2$ alone,
the unknowns in his physical model. In that case he needs the marginal
posterior density:

$$p_{\theta_2}(\theta_2 \mid x) = \int p_{\theta_1,\theta_2}(\theta_1, \theta_2 \mid x) \, d\theta_1. \tag{16}$$

This marginal density is the probabilistic representation of the synthesis
of his prior knowledge and the information inherent in the data, due
allowance having been made for modelling error.
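
The three stages and the final marginalisation can be sketched numerically
for a toy case. The sketch assumes, purely for illustration, scalar
parameters, a normal observation error, a normal modelling error, and a
flat stage III prior; none of these choices or values comes from the paper.
Under these assumptions the marginal posterior of the physical parameter
should be centred on the observation with variance equal to the sum of the
two error variances, which the grid calculation reproduces.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def marginal_posterior_theta2(x_obs, sd_obs, sd_model, grid):
    """Marginal posterior of theta2 on an even grid, integrating out theta1.

    Stage I:   X | theta1      ~ N(theta1, sd_obs^2)    (observation error)
    Stage II:  theta1 | theta2 ~ N(theta2, sd_model^2)  (modelling error)
    Stage III: flat prior on theta2 over the grid       (assumed here)
    """
    step = grid[1] - grid[0]
    unnormalised = []
    for t2 in grid:
        # Joint density along X = x_obs, summed over the theta1 grid.
        total = sum(
            normal_pdf(x_obs, t1, sd_obs) * normal_pdf(t1, t2, sd_model)
            for t1 in grid
        )
        unnormalised.append(total * step)
    constant = sum(unnormalised) * step
    return [u / constant for u in unnormalised]


STEP = 0.1
grid = [-15.0 + STEP * k for k in range(301)]    # grid from -15 to 15
posterior = marginal_posterior_theta2(1.0, 1.0, 2.0, grid)
mean_theta2 = sum(t * p for t, p in zip(grid, posterior)) * STEP
var_theta2 = sum((t - mean_theta2) ** 2 * p for t, p in zip(grid, posterior)) * STEP
print(f"mean = {mean_theta2:.3f}, variance = {var_theta2:.3f}")
```

The variance of the result exceeds the observation variance alone, which
is the numerical counterpart of making due allowance for modelling error.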

The Bayesian three-stage model is discussed in detail by Lindley & Smith
(1972), Smith (1973), and French (1978). In the next section we shall
illustrate its application to the estimation of a reflection’s intensity
from diffractometer data collected in step-scan mode. We shall avoid
giving technical details and instead concentrate on illustrating how the
probability densities in each of the three stages may be developed so as
to fairly represent the available information.


3.  A  profile-fitting  method  for  the  analysis  of 
diffractometer  intensity  data 

For a diffractometer operating in step-scan mode, each reflection is
recorded by measuring a sequence of $N$ counts as the machine steps across
the peak and its local background. Each count $C_i$ is an observation on
the true (mean) count $\lambda_i$ at the $i$th step. Thus

$$C_i \sim p_{C_i}(\cdot \mid \lambda_i), \quad i = 1, 2, \ldots, N, \tag{17}$$

where the notation indicates that each $C_i$ is drawn from a distribution
with parameter $\lambda_i$. The distributions
$p_{C_i}(\cdot \mid \lambda_i)$ are approximately Poisson (‘counting
statistics’) with means $\lambda_i$, but are perturbed slightly through
instrument instability and, for extremely intense reflections, saturation
counting losses. The latter effect does not occur in the weak data sets of
protein crystallography, the area of our experience, so we shall ignore
it. However, our analysis could be modified to take account of its
presence.

Each $\lambda_i$ is the sum of two elements: a contribution from the
reflection’s intensity and a contribution from the local background. Thus

$$\lambda_i = J \cdot \pi(x_i) + b(x_i), \quad i = 1, 2, \ldots, N, \tag{18}$$

where $J$ is the integrated intensity; $\pi(x)$ is the peak shape
function, so $\int \pi(x) \, dx = 1$; $x_i$ is the position in the scan of
the $i$th step; and $b(x_i)$ is the background scatter at $x_i$. The
problem is to obtain an estimate of $J$ together with some indication of
the precision of this estimate.
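
To make the decomposition of the mean count concrete, the following sketch
evaluates intensity times peak shape plus background at each step of a
hypothetical scan. A normal density is used only as a stand-in peak shape,
and the step width, intensity, and constant background level are invented
for the example.

```python
import math


def expected_counts(intensity, positions, peak_shape, background):
    """Mean count at each step: intensity * peak_shape(x) + background(x)."""
    return [intensity * peak_shape(x) + background(x) for x in positions]


def normal_peak(centre, width):
    """A normal density, used here only as a stand-in peak shape."""
    def shape(x):
        z = (x - centre) / width
        return math.exp(-0.5 * z * z) / (width * math.sqrt(2.0 * math.pi))
    return shape


STEP_WIDTH = 0.05
positions = [k * STEP_WIDTH for k in range(-40, 41)]   # 81 steps across the scan
BACKGROUND_LEVEL = 2.0                                 # assumed constant background

mean_counts = expected_counts(
    intensity=500.0,
    positions=positions,
    peak_shape=normal_peak(centre=0.0, width=0.3),
    background=lambda x: BACKGROUND_LEVEL,
)

# Because the peak shape integrates to unity, the background-subtracted sum,
# scaled by the step width, recovers the integrated intensity.
recovered = sum(c - BACKGROUND_LEVEL for c in mean_counts) * STEP_WIDTH
print(f"recovered intensity = {recovered:.2f}")
```

In practice neither the peak shape nor the background is known, which is
why both must be estimated along with the intensity.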


The data available for the estimation of $J$ clearly include the measured
counts $(c_1, c_2, \ldots, c_N)$, but there are other sources of
information which are often overlooked. In short, these are: (i) the local
behaviour of the background, which may be predicted from the collection
geometry; (ii) the properties expected in the peak shape, e.g. continuity
and, in many cases, unimodality; (iii) the shape of the peaks already
analysed, since peak shape tends to vary only slowly through reciprocal
space (Diamond, 1969); (iv) the position within the scan of the last
measured peak and the reliability of the diffractometer in moving from one
reflection to the next; and (v) for a multiple counter diffractometer, the
relative position of the peaks within the simultaneously collected scans
may be predicted from the collection geometry and, moreover, the peak
shapes and backgrounds on these scans will usually be very similar.

Various methods have been proposed for the estimation of $J$. The majority
are summarised and discussed by French (1975), and Oatley & French (1981).
None of the methods makes use of all the sources of information listed
above; indeed, most base their estimate on the sequence of measured counts
alone, ignoring sources (i) to (v) entirely. Furthermore, many lead to
positively biased estimates of the intensities and poor, occasionally
theoretically incorrect, indications of the precisions of these estimates.

The profile-fitting method which we describe here appears to overcome all
the difficulties encountered by other methods. Its development within the
framework of the Bayesian three-stage model is entirely natural and
straightforward, illustrating the power of the Bayesian approach to
organise thought.

Expressions (17) and (18) indicate that the expected value of $C_i$ is

$$E(C_i) = \lambda_i = J \cdot \pi(x_i) + b(x_i) \tag{19}$$

for $i = 1, 2, \ldots, N$. Our approach is to fit the vector of observed
counts with a function of the form
$(J \cdot \pi(x, \alpha) + b(x, \beta))$, where $\pi(x, \alpha)$ and
$b(x, \beta)$ are parametric approximations to the true, but unknown peak
shape and background functions respectively. Since the peak shape function
should integrate to unity, obvious candidates for $\pi(x, \alpha)$ are
probability density functions. We have found that Johnson’s (1949)
suggestion for transforming the normal density curve leads to families of
curves which well approximate the peak shapes that arise in protein
crystallography. In particular, this choice for $\pi(x, \alpha)$ embodies
the information that the underlying peak shape is continuous and unimodal.
Fig. 3 illustrates the wide variety of continuous, unimodal curves that
can result from this choice of $\pi(x, \alpha)$.

For the backgrounds we have found that a linear approximation is adequate;
viz. $b(x, \beta) = \beta_1 + \beta_2 x$. Usually the background is very
nearly constant across the scan, so $\beta_2 \approx 0$.

Fig. 3. Examples of peak shapes obtainable for our choice of
$\pi(x, \alpha)$.

Other choices of $\pi(x, \alpha)$ and $b(x, \beta)$ may be more
appropriate in other branches of crystallography or for other collection
geometries. Obviously, if the data indicate that the true peak shape is
bimodal, either because the $\alpha$-doublet is resolved or because the
crystal has split, then it is not sensible to use a unimodal choice of
$\pi(x, \alpha)$. However, it should be noted that our profile-fitting
method remains applicable whatever choices are made.

We now set the problem of fitting $(J \cdot \pi(x, \alpha) + b(x, \beta))$
to the observed vector of counts into the structure of a Bayesian
three-stage model.

Stage III: Prior knowledge

Firstly, let us note the unknown parameters within our model. They are
$J$, the true intensity; $\alpha$, the parameters of the approximate peak
shape function; and $\beta$, the parameters of the approximate background
function.

Typically we have no prior knowledge of $J$; we wish to determine its
value from the data alone. We do this through a vague prior distribution.
A vague prior distribution is one which states that nothing is known about
the relevant quantity other than the information contained in the
experimental data. Suppose $J$ can be maximally $10^5$, but is usually of
the order of a few hundred. The prior distribution

$$J \sim N(0, 10^{20}), \tag{20}$$

i.e. normal with zero mean and enormous variance, has an effectively
constant density over the plausible range of $J$; and thus does not
differentially weight the possible values of $J$. Hence the data alone
will determine the posterior distribution of $J$.
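
The claim that such a prior is effectively constant can be checked
directly. The short sketch below, whose function name is our own, compares
the prior density at the largest plausible intensity with its value at
zero; the exponent involved is $(10^5)^2 / (2 \times 10^{20}) = 5 \times
10^{-11}$, so the ratio differs from unity only in about the eleventh
decimal place.

```python
import math


def normal_log_density(x, mean, variance):
    """Natural logarithm of the N(mean, variance) density at x."""
    return (-0.5 * math.log(2.0 * math.pi * variance)
            - (x - mean) ** 2 / (2.0 * variance))


# Vague prior: zero mean and variance 10^20; the intensity is assumed to lie
# somewhere between 0 and 10^5.
PRIOR_VARIANCE = 1e20
density_ratio = math.exp(
    normal_log_density(1e5, 0.0, PRIOR_VARIANCE)
    - normal_log_density(0.0, 0.0, PRIOR_VARIANCE)
)
print(f"density at 1e5 relative to density at 0: {density_ratio:.15f}")
```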

The peak parameters $\alpha$ are responsible for determining the position
of the peak within the scan, the width of the peak, and properties of the
peak shape such as skewness and kurtosis (‘peakedness’). Once several
peaks have been fitted, we will have learnt much about these qualities,
and hence about $\alpha$. Suppose that we are setting the prior for the
$s$th reflection, after successfully fitting the profile at the
$(s-1)$th reflection, adjacent to it in reciprocal space. Suppose further
that the posterior distribution for the parameters after the $(s-1)$th
reflection is

$$\alpha_{s-1} \sim N(m_{s-1}, W_{s-1}). \tag{21}$$

For our collection geometries there are no predictable changes in $\alpha$
between reflections; but, remembering information sources (iii) and (iv)
above, it is known that, first, the peak shape varies only very slowly
across reciprocal space and, second, even if there is crystal slippage,
the peak position within one scan will be very close to that of the
previous reflection. Hence we may use $m_{s-1}$ as the prior mean for
$\alpha_s$, but should increase the diagonal terms of $W_{s-1}$ slightly
to form the prior variance. This slight inflation of the diagonal allows
for the unpredictable, but small changes in the peak parameters between
reflections.
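
One way this carrying forward of information might be coded is sketched
below. The text says only that the diagonal terms should be increased
slightly; adding a small fixed amount to each diagonal term is our
assumption, and the function name and numbers are hypothetical.

```python
def prior_from_previous_posterior(post_mean, post_cov, inflation):
    """Prior mean and covariance for reflection s from the posterior at s - 1.

    inflation[k] is added to the k-th diagonal term of the covariance
    matrix; off-diagonal terms are unchanged. Additive inflation is an
    assumption made for this sketch.
    """
    prior_mean = list(post_mean)
    prior_cov = [list(row) for row in post_cov]   # copy, leaving input intact
    for k, extra in enumerate(inflation):
        prior_cov[k][k] += extra
    return prior_mean, prior_cov


# Hypothetical two-parameter example (say, peak position and peak width).
m_prev = [0.10, 0.30]
w_prev = [[0.004, 0.001],
          [0.001, 0.002]]
m_prior, w_prior = prior_from_previous_posterior(m_prev, w_prev, [0.001, 0.0005])
print(m_prior, w_prior)
```

Leaving the off-diagonal terms alone keeps the covariances learnt from
earlier reflections while slightly weakening the overall confidence in
them.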

Finally a prior distribution must be set for $\beta$. We shall assume the
form: $b(x, \beta) = \beta_1 + \beta_2 x$. In most cases, we use the prior
distribution

$$\begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix} \sim N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 10^{20} & 0 \\ 0 & 10^{-10} \end{pmatrix} \right). \tag{22}$$

The variance of $10^{20}$ corresponds to a vague prior for the background
level $\beta_1$, while the variance $10^{-10}$ forces the background slope
to remain very close to zero. If it is believed that the background
actually slopes, a more appropriate value than $10^{-10}$ should be
chosen; e.g. if $\beta_2$ is likely to be in the range $-1$ to $1$, a
prior variance of 1 would be reasonable. Thus through (22) we introduce
our prior knowledge of
the local behaviour of the background (information
source (i)).

Further details of how prior knowledge may be modelled at stage III are
discussed by French (1978), and Oatley & French (1981), where ways of
treating multiple-counter data are also indicated. We would emphasise that
our specific suggestions above are appropriate to our experience within
protein crystallography. They may need modifying for other applications;
however, the general principles would still hold.

Stage  II:  Modelling  error 

This stage describes how well it is expected that
the true count $\lambda_i$ will be modelled by the parametric
approximation

$\nu_i = J\,\pi(x_i, \alpha) + b(x_i, \beta)$.  (23)

We assume that the approximation is unbiased, i.e.

$E(\lambda_i) = \nu_i$.  (24)

Also it seems reasonable to expect the modelling error
to increase with the magnitude of $\nu_i$. In fact, we
assume rather more than this: namely, that the
modelling error has constant relative variance, viz.

$\mathrm{Var}(\lambda_i \mid \nu_i) = \sigma_2^2\,\nu_i^2$  (25)

where $\sigma_2^2$ is constant for all steps in the scan. In the
solution of the three-stage model these assumptions
are modified very slightly. There are computational
advantages in working with $\sqrt{\lambda_i}$ and $\sqrt{\nu_i}$ rather than
$\lambda_i$ and $\nu_i$ themselves. So we make this transformation
and assume that

$\sqrt{\lambda_i} \sim N(\sqrt{\nu_i},\ 0.25\,\sigma_2^2\,\nu_i)$.  (26)

In strict probabilistic terms this contradicts (24) and
(25), because $E(\sqrt{\lambda_i}) \neq \sqrt{E(\lambda_i)} = \sqrt{\nu_i}$ and because the
variance in (26) is not a completely accurate
transformation of that given by (25). However, numerically
no great error is introduced (see French (1978),
Appendix B).
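The size of the approximation involved in passing from (25) to (26) can be examined numerically. The sketch below is only an illustration and is not taken from French (1978): it assumes a gamma distribution for the true count (any distribution with the stated mean and relative variance would do), and the names `nu`, `sigma2` and `sqrt_transform_check` are hypothetical.

```python
import numpy as np

def sqrt_transform_check(nu, sigma2, n=200_000, seed=0):
    """Compare Var(sqrt(lambda)) with the value 0.25 * sigma2 * nu used in (26).

    Assumption (not from the source): lambda is gamma distributed with
    mean nu and variance sigma2 * nu**2, i.e. constant relative variance.
    Returns (simulated variance, predicted variance, simulated mean).
    """
    rng = np.random.default_rng(seed)
    shape = 1.0 / sigma2      # gamma shape k: relative variance is 1/k
    scale = nu * sigma2       # gamma scale: mean = shape * scale = nu
    lam = rng.gamma(shape, scale, size=n)
    root = np.sqrt(lam)
    return root.var(), 0.25 * sigma2 * nu, root.mean()

if __name__ == "__main__":
    var_sim, var_pred, mean_sim = sqrt_transform_check(nu=1000.0, sigma2=0.01)
    print("simulated variance of sqrt(lambda):", var_sim)
    print("value assumed in (26):             ", var_pred)
    print("simulated mean, sqrt(nu):          ", mean_sim, np.sqrt(1000.0))
```

For a relative standard deviation of about 10% the two variances should agree closely, and the simulated mean of the square root should fall slightly below $\sqrt{\nu}$, which is the small bias referred to in the text.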

Further discussion of stage II, in particular of the
problem of setting a reasonable value for $\sigma_2^2$, is given
by French (1978), and Oatley & French (1981).

Stage  I:  Observation  errors 

This stage models the counting and instrument
instability errors in the observations. If we temporarily
ignore the latter errors, which are small, then we know
that $P_{C_i}(\cdot \mid \lambda_i)$ would be Poisson. Now, if the
distribution of $C_i$ is Poisson, the distribution of $\sqrt{C_i}$ may
be extremely well approximated by a normal
distribution with a constant variance of 0.25 (Box & Tiao,
1973, Fig. 1.3.8). Thus we have

$\sqrt{C_i} \sim N(\sqrt{\lambda_i},\ 0.25)$.  (27)

The presence of instrument instability error should not
disturb this distribution greatly, although it will
inflate the variance. McCandlish, Stout & Andrews
(1975), amongst others, have argued that machine
instability gives rise to errors of approximately
constant relative variance in the counts. Letting $\sigma_1^2$ be
this constant relative variance and assuming that
instrument instability is independent of the counting
errors, we have

$\sqrt{C_i} \sim N(\sqrt{\lambda_i},\ 0.25\,(1 + \sigma_1^2 \lambda_i))$.  (28)
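The variances quoted in (27) and (28) can be checked by simulation. The following is a minimal sketch rather than the authors' procedure; it assumes that instrument instability multiplies the true count by a gamma-distributed factor of unit mean, and the function and parameter names are hypothetical.

```python
import numpy as np

def sqrt_count_variance(lam, sigma2_inst=0.0, n=200_000, seed=0):
    """Simulate Var(sqrt(C)) for counts C with true rate lam.

    sigma2_inst is the relative variance attributed to instrument
    instability. Assumption (not from the source): the instability is
    modelled as a gamma-distributed rate with mean lam.
    Returns (simulated variance, value predicted by (27)/(28)).
    """
    rng = np.random.default_rng(seed)
    if sigma2_inst > 0.0:
        shape = 1.0 / sigma2_inst
        rate = rng.gamma(shape, lam * sigma2_inst, size=n)  # mean lam
    else:
        rate = np.full(n, float(lam))
    counts = rng.poisson(rate)
    predicted = 0.25 * (1.0 + sigma2_inst * lam)
    return np.sqrt(counts).var(), predicted

if __name__ == "__main__":
    print(sqrt_count_variance(100.0))                    # pure counting error
    print(sqrt_count_variance(100.0, sigma2_inst=0.01))  # with instability
```

With pure Poisson counting the simulated variance should stay near 0.25 whatever the rate, which is the property that makes the square-root scale convenient.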

This  completes  our  brief  description  of  the  three- 
stage  model  that  underlies  our  profile-fitting  method ; 
full  details  are  given  by  French  (1978),  and  Oatley  & 
French  (1981).  There  we  discuss  the  very  important 
question  of  how  to  set  the  prior  distributions  initially, 
before  any  reflections  have  been  measured,  and  also 
the  question  of  how  to  set  them  when  the  previously 
fitted  reflection  was  far  away  in  reciprocal  space,  not 
adjacent  as  assumed  here. 

With all the distributions defined, it is a conceptually
easy, though by no means computationally trivial,
task to produce the posterior distribution for $J$ given
the observed counts. The mean of this serves as our
estimate of the integrated intensity, and the variance
indicates the precision.

The  method  works  well  in  practice;  in  the  past  six 
years  it  has  become  the  standard  method  of  producing 
the  integrated  intensities  of  diffractometer  data  in  the 
Laboratory  of  Molecular  Biophysics  in  Oxford.  We 
present  just  three  examples  to  illustrate  its  value  here; 
others  may  be  found  in  Oatley  &  French  (1978). 

In  these  examples  we  compare  the  performance  of 
profile-fitting  with  that  of  the  ordinate-analysis 
(Watson  et  al.,  1970)  and  centroid  methods  (Tickle, 
1975).  Both  these  use  the  sequence  of  measured 
counts  to  centre  a  window  on  the  peak.  The  peak  is 
assumed  to  lie  entirely  within  the  window,  and  the 
integrated  intensity  is  taken  to  be  the  difference 
between  the  total  count  within  the  window  and  the 
(appropriately  scaled)  total  count  outside  the  window. 
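The window calculation shared by the two comparison methods can be sketched as follows. This shows only the final subtraction described above; the rules by which ordinate analysis and the centroid method position the window are not reproduced, and the function name and arguments are hypothetical.

```python
import numpy as np

def window_intensity(counts, start, stop):
    """Integrated intensity from a step scan, given a peak window.

    counts      : measured counts at each step of the scan
    start, stop : the window covers steps start .. stop-1 (assumed given)

    The background total outside the window is scaled by the ratio of
    the number of steps inside to the number outside, then subtracted.
    """
    counts = np.asarray(counts, dtype=float)
    inside = counts[start:stop]
    outside = np.concatenate([counts[:start], counts[stop:]])
    if inside.size == 0 or outside.size == 0:
        raise ValueError("window must leave steps both inside and outside")
    scale = inside.size / outside.size
    return inside.sum() - scale * outside.sum()

if __name__ == "__main__":
    scan = [10, 10, 10, 50, 90, 50, 10, 10, 10, 10]
    print(window_intensity(scan, 3, 6))  # window centred on the peak
    print(window_intensity(scan, 0, 3))  # window misplaced at the scan start
```

The second call illustrates why the method depends so heavily on the window being centred correctly: a misplaced window counts the peak as background.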

Fig.  4  illustrates  the  profile-fitting  of  three  peaks 
from  a  consecutive  sequence  of  eight  reflections  in 
some  cubic  insulin  data  (Dodson  et  al.,  1978).  The 
ability  of  the  method,  guided  by  its  prior  knowledge  of 
local  behaviour  of  the  background,  peak  shape,  and 
peak  position,  to  distinguish  signal  from  noise  is 
illustrated  in  Peak  4.  Here  ordinate  analysis  has 
defined  a  peak  window  too  near  the  start  of  the  scan, 
resulting  in  overestimation  of  the  intensity.  It  can  also 
be  seen  that  as  a  result  of  crystal  movement  the  peak 
position  has  shifted  considerably  during  this  sequence, 
and  it  is  encouraging  to  note  how  well  this  has  been 
tracked  by  the  fitting.  This  movement  has  resulted  in 
the  last  peak,  and  also  the  sixth  and  seventh,  being 
seriously  ‘clipped’;  this  would  invalidate  other 
methods of step-scan integration, but our
profile-fitting method is able to extract valid intensity
information  from  the  scan. 

Fig. 4. Profile-fitting of three reflections from cubic insulin
data. The window positions determined by ordinate analysis
are indicated by the thick line at the base of each diagram.

          Ordinate analysis      Profile-fitting
          Intensity   s.d.       Intensity   s.d.
Peak 1       938       38           952       41
Peak 4       132       25            88       22
Peak 8      2623       56          2723       61

N.B.: Although the calculated standard deviations make
ordinate analysis appear the more accurate method in two of
the three cases, this is not so. Ordinate analysis standard
deviations are based upon a theoretically incorrect formula
(it makes no allowance for the random centring of the
window) and, moreover, do not include a contribution for
instrument instability errors.

Fig. 5 illustrates three reflections taken from some
2-Zn insulin data (Dodson et al., 1979). These
reflections lie close, but not adjacent, to each other
in reciprocal space. For these centroid and ordinate
analysis produced essentially the same results;
profile-fitting agrees with them on the first and third peaks,
but produces a much more sensible value for the
intensity of the second reflection.

Fig. 5. Profile-fitting of three reflections from 2-Zn insulin
data. The window positions determined by ordinate analysis
are indicated by the thick line at the base of each diagram.

          Ordinate analysis      Profile-fitting
          Intensity   s.d.       Intensity   s.d.
Peak A       113       52           125       54
Peak B       145       47            35       49
Peak C       127       49           128       56

N.B.: See note to Fig. 4 concerning standard deviations.

Fig. 6 illustrates the analysis of some data that was
collected on a five-counter diffractometer. Ordinate
analysis and the centroid method have both been
modified to locate the peak window on a combined
profile from the five simultaneously collected scans,
thus using prior knowledge of the relative positions
of the five peaks (Banner et al., 1977). Because of this,
these methods are far more reliable than when used
on single-counter diffractometers. Nonetheless, we
have found that our profile-fitting can still offer a
significant improvement. The three quintuplets shown
here are taken from high resolution data for native
prealbumin (Oatley, 1976; Blake et al., 1978), and for
prealbumin with T3 and T4 bound (Oatley & Burridge,
1981). Fig. 6(a) shows a quintuplet where both the
centroid method and profile-fitting agree. Fig. 6(b)
shows a very weak quintuplet where the centroid
method has failed completely to locate the window
sensibly. Profile-fitting has not been misled by the
generally higher counts in the early part of the scan;
to position the peak there would mean moving too
far from the positions of previously fitted peaks.
Even when the centroid method finds a sensible
window position, the random nature of the counts
may lead it to produce unsatisfactory integrated
intensities. Surely the profile-fitted values are the
more sensible for the quintuplet shown in Fig. 6(c).

Fig. 6(a), (b), (c). Profile-fitting of reflections from the
multiple-counter prealbumin data sets. The window positions
calculated by the centroid method are indicated by the thick
line at the base of each scan. The integrated intensities
calculated by the two methods are given below.

(a)  Counter:             1      2      3      4      5
     Centroid method:    65    229    189    581    175
     Profile-fitting:    86    174    182    567    182

(b)  Counter:             1      2      3      4      5
     Centroid method:    11    -14     93     10    -12
     Profile-fitting:     0      4     44     22     24

(c)  Counter:             1      2      3      4      5
     Centroid method:   132    144     17    151    564
     Profile-fitting:   112    247     95     84    545

4.  Restrained refinement, hypothesis testing,
and Hamilton’s R-test

It  is  unusual  today  to  refine  the  structural  parameters 
of  moderate  and  large  molecules  against  intensity 
data  alone.  Restraints  (variously  known  as  slack 
constraints,  soft  constraints,  and  pseudo-observations) 
are  introduced  into  the  refinement  to  encourage, 
but  not  force,  bond  lengths,  bond  angles,  etc.  to  take 
sensible  values,  i.e.  values  similar  to  those  found  in 
previous  structural  determinations.  French  (1978)  has 
shown  that  the  Bayesian  three-stage  model  provides 
a  natural  framework  for  discussing  such  refinements; 
however,  we  shall  not  discuss  that  here.  Instead  our 
concern  is  with  choosing  between  alternative  refined 
structural  models. 

Consider  a  specific  example.  Suppose  that  in  a 
protein  dimer  it  is  clear  that  the  two  molecules  are 
very  nearly  related  by  a  non-crystallographic  two-fold 
axis.  The  question  is  whether  they  are  exactly  related 
so.  The  approach  currently  adopted  to  answer  this 
question  is  to  carry  out  two  restrained  refinements: 
the  first  with  the  two  molecules  constrained  to  obey 
the  two-fold  symmetry  relation  exactly;  the  second 
with  the  atoms  in  the  two  molecules  free  to  move 
independently,  and  thus  with  twice  the  number  of 
parameters  and  restraints  used  in  the  first  refinement. 
Hamilton’s  (1965)  R-test  is  then  applied  to  see  if  there 
is  a  significant  difference  between  the  residuals  in  the 
two  refinements.  Unfortunately  the  theory  of  the 
R-test  is  not  applicable  to  this  situation.  Neither,  for 
that  matter,  is  the  theory  of  any  of  the  alternatives  to 
the  R-test  that  have  been  proposed  (Rogers,  1981; 
Rothstein,  Richardson,  &  Bell,  1978).  However,  see 
Critchley  (1980)  for  a  sensible,  if  ad-hoc,  approach  to 
using  Hamilton’s  R-test  in  such  situations. 

The reason why these hypothesis tests are inapplicable
is that they are designed to check whether a
given hypothesis explains a set of data to within
experimental error or whether a more general
hypothesis is necessary to explain the same data. Now
restraints are effectively extra experimental
observations on the system under investigation; hence
French (1978) termed them pseudo-observations. If
one molecular model is refined against a different set
of restraints to the other, then they are effectively
refined against different data sets; and so frequentist
hypothesis testing theory does not apply.

As  always  the  problem  is  that  frequentist  statistics 
provide  no  framework  for  handling  prior  knowledge. 
The  Bayesian  framework,  on  the  other  hand,  does 
allow  for  prior  knowledge;  indeed,  it  insists  that  the 
scientist must always know something about the
unknowns in his model, be it only whether they are real
or  complex.  Thus  we  may  expect  that  a  Bayesian 
hypothesis  test  may  be  developed  to  compare  different 
structural  models  that  have  resulted  from  restrained 
refinements.  We  shall  not  develop  such  a  method  in 
any  great  detail  here;  however,  we  shall  indicate  how 
it  may  be  done. 

We shall suppose that the scientist has two alternative
physical models, $M_1$ and $M_2$, which he wishes
to compare. These models are to be taken as functional
forms involving unknown quantities, and we consider
them before any restrained refinement has taken place.
We shall let $P_M(M_i)$ be the scientist’s prior belief in
model $M_i$ before the refinement ($i = 1, 2$). We discuss
below the contentious issue of how $P_M(M_i)$ may be
set numerically. As we have remarked, there will be
unknown quantities within each model; let $\theta_i$ be those
unknowns within $M_i$ ($i = 1, 2$). (Here subscripts on
$\theta$’s refer to models, not stages.) Within the context of
$M_i$ the scientist will have prior beliefs about $\theta_i$; let
$P_{\theta_i}(\cdot \mid M_i)$ be the prior density representing these
beliefs. In other words, $P_{\theta_i}(\cdot \mid M_i)$ represents his
beliefs if for the time being he assumes that $M_i$ is the
true model.

Next he considers the experiment with its
observations $X$. Let $P_X(\cdot \mid \theta_i, M_i)$ be the density which
describes his expectation of the observations if he
assumes that $M_i$ is the true model and that the
unknowns take the values $\theta_i$.

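After $X = x$ has been observed, Bayes’ theorem may
be invoked to show that the ratio of posterior
probabilities of $M_1$ and $M_2$ is

$\frac{P_M(M_1 \mid x)}{P_M(M_2 \mid x)} = \frac{\int P_X(x \mid \theta_1, M_1)\, P_{\theta_1}(\theta_1 \mid M_1)\, d\theta_1}{\int P_X(x \mid \theta_2, M_2)\, P_{\theta_2}(\theta_2 \mid M_2)\, d\theta_2} \cdot \frac{P_M(M_1)}{P_M(M_2)}$.  (29)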

Noting that $P_M(M_1 \mid x) + P_M(M_2 \mid x) = 1$, we can
calculate the posterior probabilities of the models. These
are natural criteria for choosing between $M_1$ and $M_2$.
The more the data and prior knowledge support a
model the larger will be its posterior probability and
the smaller that of the alternative. Various authors
have considered the form of (29) for different
distributional assumptions: Jeffreys (1961), Lempers (1971),
and Smith & Spiegelhalter (1980) are a useful source
of reference. It is possible to combine the Bayesian
approach to restrained refinement discussed by French
(1978) with the above development, and thus produce
an alternative to Hamilton’s R-test. However, there
is a conceptual problem in (29) that we should admit
and discuss.
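The arithmetic of (29) is easily sketched. The code below is an illustration under stated assumptions and is not the authors' method: the unknowns of each model are taken to be one-dimensional so that the integrals can be approximated on a grid, the ratio of the two integrals is given the conventional name "Bayes factor" (a term not used in the text), and all function names are hypothetical.

```python
def marginal_likelihood(likelihood, prior_density, grid):
    """Trapezoidal approximation to the integral of
    likelihood(theta) * prior_density(theta) over a 1-D grid of theta.
    (Assumes a single unknown; real refinements have many.)"""
    values = [likelihood(t) * prior_density(t) for t in grid]
    total = 0.0
    for k in range(1, len(grid)):
        total += 0.5 * (values[k] + values[k - 1]) * (grid[k] - grid[k - 1])
    return total

def posterior_model_probabilities(bayes_factor, prior_m1):
    """Posterior probabilities of M1 and M2 from equation (29).

    bayes_factor : ratio of the two integrals in (29), M1 over M2
    prior_m1     : prior probability of M1; that of M2 is 1 - prior_m1
    """
    if not 0.0 < prior_m1 < 1.0:
        raise ValueError("prior_m1 must lie strictly between 0 and 1")
    prior_odds = prior_m1 / (1.0 - prior_m1)
    posterior_odds = bayes_factor * prior_odds
    p1 = posterior_odds / (1.0 + posterior_odds)
    return p1, 1.0 - p1

if __name__ == "__main__":
    # Equal prior belief and data favouring M1 by 19 to 1.
    print(posterior_model_probabilities(19.0, 0.5))
```

The second function makes the multiplicative dependence on the prior odds explicit, which is the point taken up next in the text.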

It  is  immediately  apparent  in  (29)  that  the  ratio  of 
posterior  probabilities  depends  multiplicatively  on 
the ratio of prior probabilities. These prior
probabilities are the subjective evaluations of the scientist.
Hence  critics  of  the  method  may  argue  that  it  is  not 
scientific;  because  for  a  procedure  to  be  scientific  it 
must  surely  be  objective. 

There  are  a  number  of  points  to  make  here,  but 
first  let  us  widen  the  discussion.  All  Bayesian  inference 
is  subjective.  The  posterior  distribution  always  depends 
to  some  extent  on  the  prior  distribution,  which 
represents  the  scientist’s  initial  subjective  beliefs.  Thus 
the  methods  presented  in  the  earlier  sections  of  this 
paper are just as liable to the criticism of subjectivity.
We admit the subjectivity inherent in our methods,
but do not see it as a failing.

Categorising  very  broadly,  a  Bayesian  analysis  of 
a  problem  may  fall  into  one  of  three  classes.  First, 
there  might  be  total  agreement  amongst  the  scientific 
community  about  what  the  prior  should  be.  Although 
it  is  dangerous  to  claim  the  agreement  of  others,  we 
would  suggest  that  the  majority  of  crystallographers 
would  concur  with  the  use  of  Wilson’s  distributions 
in  (2)  and  (3)  and  would  thus  find  the  analysis  of  §1 
acceptable  and,  indeed,  scientific.  Second,  although 
there  may  be  disagreement  over  the  validity  of  certain 
prior  beliefs,  it  may  happen  that  the  data  are  so  strong 
that  they  dominate  the  analysis  and  lead  to  essentially 
the  same  posterior  distribution  whatever  reasonable 
prior distribution is used. Box & Tiao (1973) give an
example  of  this.  Third,  there  may  be  disagreement 
over  the  validity  of  certain  prior  beliefs  and  the  data 
may  be  weak.  In  this  case  the  posterior  distribution 
is  sensitive  to  the  choice  of  prior  distribution  and  there 
will  be  a  separate  Bayesian  analysis  appropriate  to 
each member of the scientific community. We
recommend, therefore, that, when there is no obvious
consensus prior distribution, the analysis should be
carried out and reported for a range of prior
distributions. In that manner it will be possible to see how
far the data resolve the initial disagreement and,
conversely, how far they leave the controversy open.
Thus for the suggested Bayesian hypothesis test the
scientist should report the range of $P_M(M_1)$ for
which $P_M(M_1 \mid x) > 0.95$ (say).
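The range to be reported follows directly from (29). Writing $B$ for the ratio of the two integrals in (29) (a symbol introduced here for convenience, not used in the text), the posterior probability of $M_1$ exceeds a level $t$ exactly when the prior probability exceeds $t / (t + B(1 - t))$. A minimal sketch, with hypothetical function names:

```python
def posterior_m1(bayes_factor, prior_m1):
    """Posterior probability of M1 from (29), given the ratio of
    integrals (bayes_factor) and the prior probability of M1."""
    odds = bayes_factor * prior_m1 / (1.0 - prior_m1)
    return odds / (1.0 + odds)

def prior_threshold(bayes_factor, level=0.95):
    """Smallest prior probability of M1 for which the posterior
    probability of M1 reaches `level`; every larger prior (below 1)
    also satisfies the criterion."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie strictly between 0 and 1")
    return level / (level + bayes_factor * (1.0 - level))

if __name__ == "__main__":
    for b in (1.0, 10.0, 100.0, 1000.0):
        print(b, prior_threshold(b))
```

When the data are weak (a ratio near 1) only a scientist who already believed strongly in $M_1$ reaches the 0.95 level; as the ratio grows the admissible range of priors widens towards the whole interval.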

It  seems  to  us  that  the  explicit  subjectivity  of  the 
Bayesian  approach  is  an  advantage  in  that  it  quickly 
indicates  those  aspects  of  a  problem  in  which  the  data 
resolve  any  disagreement  and  those  aspects  where 
further  data  must  be  collected  before  any  agreement 
can  be  reached. 

We are grateful to many people for their
encouragement, advice, criticism, and, not the least, for their
[...] D. V. Lindley and A. J. C. Wilson, and Doctors S. R.
Critchley, R. Diamond, J. S. Rollett, and K. S.
Wilson, all of whom have helped develop the ideas that
we have discussed. All members of the Laboratory of
Molecular Biophysics, past and present, have helped
in some way or other, and to them we offer our thanks.


Intensity Statistics: Survey, Computer
Simulation and the Heavy-Atom Problem

By Uri Shmueli

Department of Chemistry, Tel-Aviv University,
Ramat Aviv, 69978 Tel Aviv, Israel


Abstract 

Recent developments in intensity statistics that
depend explicitly on the space-group symmetry of the
crystal and the atomic composition of the asymmetric
unit are illustrated, reviewed and discussed. The
need for such generalized statistics is demonstrated
by the results of a simple simulation procedure
which deals with structure-factor-like summations
and their frequency histograms. These simulation
exercises confirm that atomic heterogeneity may give
rise to serious departures from the ideal (Gaussian)
statistics and show that, under such circumstances,
different centrosymmetric space groups give rise to
entirely different intensity distributions. The
mathematical background, required for the derivation of a
unified formalism which may account for such real
distributions and still be conveniently applicable, is
given and some simple but representative derivations
are presented. This is followed by a concise but
complete summary of all the available relevant
expressions. The above formalism is first applied to
the simulated distributions, which are satisfactorily
accounted for, and is illustrated by its application to
a cumulative distribution recalculated from the
published structure of a triclinic C$_6$N$_4$O$_4$Cl$_2$-platinum
compound, for which the correct space group
($P\bar{1}$) was rather accurately indicated. Finally,
the representations of the generalized probability
density functions as Gram-Charlier and Edgeworth
expansions are examined with regard to their
convergence properties. It is concluded that the
Gram-Charlier form is to be preferred for practical
applications, at least until more terms of the series
become available.


1.  Introduction 

It is well-known that if the asymmetric unit of a
crystal contains an outstandingly heavy atom and not
too many light ones, a situation frequently
encountered in organometallic compounds and other
heterogeneous units, the distributions and moments of
integrated intensity usually deviate from those
predicted by the Wilson (1949) statistics, and
resolution of space-group ambiguities with the aid of these
ideal statistics often becomes difficult or impossible.
The extent of this discrepancy between experimental
and ideal statistics is also symmetry-dependent, and
statistics which may cope with such situations must
therefore take into account both the chemical
composition of the asymmetric unit and the space-group
symmetry of the crystal.

Probability functions satisfying the above
requirements were first given by Karle & Hauptman (1953)
and  Hauptman  &  Karle  (1953),  and  were  rederived 
and  discussed  by  other  authors  (see  Srinivasan  & 
Parthasarathy,  1976).  However,  to  the  author’s 
knowledge,  no  applications  of  these  generalized 
statistics  to  the  heavy-atom  problem  were  published 
in  the  1953-1976  period,  presumably  because  of  the 
apparent  complexity  of  the  corresponding  equations 
and  the  lack  of  a  convenient  method  whereby  space- 
group  symmetry  could  be  accounted  for,  especially 
in  the  case  of  space  groups  of  symmetries  higher  than 
the  orthorhombic. 

Recent studies of non-ideal intensity statistics
(Wilson, 1978; Shmueli, 1979; Shmueli & Kaldor,
1981; Shmueli & Wilson, 1981) confirmed the
earlier results and led to a considerable simplification
of the formalism and a generalization of the symmetry
treatment. A subsequent extension of the theory,
further simplification of the formalism and
encouraging applications to distributions based on solved
structures, were briefly reported (Shmueli, 1981a;
Shmueli, Kaldor & Wilson, 1981) and are described
in detail elsewhere (Shmueli, 1982). In the above
applications cumulative distributions of normalized
structure amplitude, recalculated from several highly
heterogeneous asymmetric units (e.g.,
C$_6$N$_4$O$_4$Cl$_2$Pt, $P\bar{1}$, $Z = 2$), were compared with theoretical
distributions corresponding to the known
compositions and possible space groups, and the known space
groups were correctly indicated.

The purpose of the present article is to review the
statistical and crystallographic principles underlying
these generalized distributions and to discuss the results
with particular attention to their potential
applicability. Section 2 introduces the heavy-atom problem
by means of an easy-to-follow computer simulation
of the effects of atomic heterogeneity and space-group
symmetry on the distribution of structure factors.
This, previously unpublished, simulation procedure
appears to be of value in preliminary considerations
as well as in the assessment of performance of the
final expressions. In section 3 generalized distributions,
presented as expansions in terms of the ideal
distributions (Wilson, 1949) and the associated orthogonal
polynomials, are dealt with. A brief survey of the
mathematical principles involved (after Cramer, 1951)
is followed by a rederivation of the fourth moment of
the normalized structure amplitude $|E|$, in terms of the
Wilson (1978) statistics of the trigonometric structure
factor, and a discussion of explicit relationships of
the latter to space-group symmetry operations with
particular attention to computational procedures
(Shmueli & Kaldor, 1981). This section is concluded
with a summary of expressions for probability density
functions (p.d.f.), even moments and cumulative
distribution functions (c.d.f.) of $|E|$ which depend on
the space-group symmetry and atomic composition
and are valid for structures having all the atoms in
general positions, with no conspicuous
non-crystallographic symmetry and negligible effects of
anomalous dispersion.

Section 4 considers the Gram-Charlier and
Edgeworth arrangements (Cramer, 1951) of the
above expansions, with regard to relationships
between the atomic heterogeneity of the asymmetric
unit and the rate of convergence of these series.
These considerations, of great importance in practical
applications, are aided by simulation of distributions
corresponding to various heterogeneities and it is
tentatively concluded that the Gram-Charlier
arrangement should be adopted, at least until more terms of
the generalized expansions become available and can
be evaluated.


2.  A  simulation  of  the  heavy-atom  problem 

The  purpose  of  this  section  is  to  introduce  and 
illustrate  a  method  whereby  statistical  aspects  of 
intensity  distribution  in  a  diffraction  pattern  can  be 
conveniently  and  reliably  simulated.  Apart  from  its 
probable  didactic  value,  the  simulation  method  to  be 
described  affords  a  means  of  rapid  and  extensive 
numerical  tests  of  the  theory  with  regard  to  the  effects 
of  space-group  symmetry  and  atomic  heterogeneity 
on  intensity  statistics. 

Consider two routine experiments in which intensity
data were collected from crystals belonging to space
groups $P\bar{1}$ and $Pmmm$. In each case the asymmetric
unit contains 24 atoms, all located in general positions,
and 3000 non-zero reflexions are available for the
structure determination of each crystal. Let us further
assume that there is no pseudosymmetry in the
structures and that the asymmetric unit of each contains
one outstandingly heavy atom and twenty-three
equal lighter ones, the atomic number of the heavy
atom being fourteen times that of a light atom in each
structure. This ratio of atomic numbers corresponds
roughly to one mercury among 23 carbons, which is
a rather highly heterogeneous composition.

We wish to simulate the frequency distribution of
the structure amplitudes $|F|$, for the above two
experiments, and shall first consider the corresponding
expressions for the structure factors. These are

$F^{(1)}(hkl) = 2 \sum_{j=1}^{24} f_j \cos 2\pi (hx_j + ky_j + lz_j)$  (1)

and

$F^{(2)}(hkl) = 8 \sum_{j=1}^{24} f_j \cos 2\pi hx_j \cos 2\pi ky_j \cos 2\pi lz_j$  (2)

for $P\bar{1}$ and $Pmmm$ respectively. Since all the atoms
occupy general positions, then for a given atomic
position $x_j y_j z_j$ and a large set of $hkl$ data, the
fractional parts of the products $\mathbf{h} \cdot \mathbf{r}_j = hx_j + ky_j + lz_j$
[eq. (1)] and $hx_j$, $ky_j$, $lz_j$ [eq. (2)] are nearly uniformly
distributed in the [0, 1] range. It is, of course, most
important in the present context that the heavy atoms
be located in general positions.

It is this uniformity which imparts to the atomic
contribution to the structure factor the property of a
random variable and, consequently, permits us to regard
the structure factor as a sum of random variables.
Had we also assumed that all the atoms have the same
or similar scattering factors (the ‘equal-atom’ case),
we would be dealing with sums of independent,
random variables, all having the same distribution,
and the strongest central limit theorem (due to
Lindeberg and Lévy, see Cramer, 1951), which predicts a
normal distribution of such a sum, would be directly
applicable, as is known from the theory and practice
of the Wilson (1949) statistics. There are weaker
central-limit theorems which predict asymptotic
normality of sums of differently distributed variables;
however, the additional conditions they impose
reduce the probability that an individual term will
have a relatively large contribution to the value of the
sum (Cramer, 1951). An increased number of terms
clearly achieves this purpose. For example, an
outstandingly heavy atom attached to a protein molecule
is not expected to give rise to serious departures
from the ideal statistics, unlike the situation in
small-molecule structures.
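The effect of a single dominant term can be quantified with a short calculation. The sketch below is an illustration rather than part of the original treatment: it assumes each atomic contribution is a weight times the cosine of an independent, uniformly distributed phase, for which the excess kurtosis of the sum is $-\tfrac{3}{2}\sum_j a_j^4 / (\sum_j a_j^2)^2$ (zero for a normal distribution). The function names and the 2000-atom "protein" are hypothetical.

```python
import numpy as np

def excess_kurtosis_exact(weights):
    """Excess kurtosis of sum_j a_j cos(2 pi p_j), with the p_j independent
    and uniform on [0, 1). Follows from the second and fourth cumulants of
    a single term, a**2 / 2 and -3 * a**4 / 8."""
    a = np.asarray(weights, dtype=float)
    return -1.5 * np.sum(a**4) / np.sum(a**2) ** 2

def excess_kurtosis_simulated(weights, n=100_000, seed=0):
    """Monte Carlo estimate of the same quantity, as a cross-check."""
    rng = np.random.default_rng(seed)
    a = np.asarray(weights, dtype=float)
    phases = rng.random((n, a.size))
    sums = (a * np.cos(2.0 * np.pi * phases)).sum(axis=1)
    d = sums - sums.mean()
    return np.mean(d**4) / np.mean(d**2) ** 2 - 3.0

if __name__ == "__main__":
    equal = [1.0] * 24                   # equal-atom case
    heavy = [14.0] + [1.0] * 23          # one heavy atom among 23 light ones
    protein = [14.0] + [1.0] * 1999      # same heavy atom, many more terms
    for name, w in (("equal", equal), ("heavy", heavy), ("protein", protein)):
        print(name, excess_kurtosis_exact(w))
    print("simulated heavy:", excess_kurtosis_simulated(heavy))
```

Under these assumptions the heavy-atom sum of 24 terms is far from normal, while adding many light atoms brings the excess kurtosis back close to zero, in line with the remark about proteins.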

The required distributions can be most simply
simulated by replacing $hx_j + ky_j + lz_j$, $hx_j$, $ky_j$
and $lz_j$ by computer-generated pseudo-random
numbers $p$, $q$, $r$ and $s$ respectively, also uniform
in the [0, 1] range, and rewriting (1) and (2) as

$A^{(1)} = \sum_{j=1}^{24} a_j \cos 2\pi p$  (3)

and

$A^{(2)} = \sum_{j=1}^{24} a_j \cos 2\pi q \cos 2\pi r \cos 2\pi s$  (4)

respectively, omitting the numerical constants. The
composition is simulated by setting $a_1 = 14$ and
$a_j = 1$ for $j \neq 1$. However, in order to test the validity
of the simulation procedure, distributions for the
equal-atom case (all $a_j$’s equal unity) will also be
constructed. The following procedure leads to the
required result.

1. Compute the mean $\langle A \rangle$, the absolute deviations
from the mean, $\Delta_k = |A_k - \langle A \rangle|$, $k = 1, \ldots, 3000$, and
the variance, $\sigma^2 = \langle \Delta^2 \rangle$, of the distribution of these
deviations. Next, construct a histogram of the $\Delta$'s
in the $[0, 3\sigma]$ range, collecting their frequency counts
in thirty channels, each $\sigma/10$ wide.

The deviations $\Delta$ correspond to the magnitudes of
the structure factors and the histogram corresponds
to their frequency distribution.
2. Compare the histogram with a scaled-up ideal
p.d.f. (the centric one, in this example)

$$p_c^{(0)}(\Delta) = K g(\Delta), \qquad (5)$$

where

$$g(\Delta) = \left(\frac{2}{\pi\sigma^2}\right)^{1/2} \exp\left(-\frac{\Delta^2}{2\sigma^2}\right) \qquad (6)$$

and the scale factor $K$ can be estimated as

$$K = \sum_{i=1}^{30} h(\Delta_i) \Big/ \sum_{i=1}^{30} g(\Delta_i), \qquad (7)$$

where $h(\Delta_i)$ is the histogram count in the channel
centred at $\Delta_i$. Note that the variance of the simulated
distribution $h(\Delta)$ is used for the construction of the
p.d.f. (6) (or any other p.d.f. to be tested).

The extent of discrepancy between the histogram
and the scaled p.d.f. (5) can be expressed by an R
factor, defined as

$$R_p^{(c)} = \left\{ \sum_{i=1}^{30} \left[h(\Delta_i) - p_c^{(0)}(\Delta_i)\right]^2 \Big/ \sum_{i=1}^{30} h^2(\Delta_i) \right\}^{1/2}. \qquad (8)$$


3. The ‘experimental’ cumulative distribution, at
the endpoints of the histogram channels, can be
computed as

$$N_h(\Delta_k) = \sum_{i=1}^{k} h(\Delta_i)/3000 \qquad (9)$$

and can be directly compared with the ideal centric
and acentric c.d.f.’s

$$N_c^{(0)}(\Delta) = \mathrm{erf}\left[\Delta/(\sigma\sqrt{2})\right] \qquad (10)$$

and

$$N_a^{(0)}(\Delta) = 1 - \exp(-\Delta^2/\sigma^2) \qquad (11)$$

based on the Wilson (1949) statistics. Of course, (9)
can be compared with other appropriate c.d.f.’s and
an R factor analogous to (8) can be evaluated.
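
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program used for the results reported here: the function names `simulate_sums` and `analyse`, the fixed seed and the use of Python's built-in generator are hypothetical choices, and a fresh set of uniform random numbers is assumed for every atom and every sample.

```python
import math
import random


def simulate_sums(weights, product_form=False, n_samples=3000, seed=1):
    """Draw n_samples values of sum (3) (one cosine per atom) or,
    if product_form is True, of sum (4) (product of three cosines)."""
    rng = random.Random(seed)              # fixed seed: an illustrative choice
    two_pi = 2.0 * math.pi
    n_factors = 3 if product_form else 1
    sums = []
    for _sample in range(n_samples):
        total = 0.0
        for a_j in weights:
            term = a_j
            for _factor in range(n_factors):
                term *= math.cos(two_pi * rng.random())
            total += term
        sums.append(total)
    return sums


def analyse(sums, n_channels=30):
    """Steps 1-3: histogram of |A - <A>|, scaled centric p.d.f. (5)-(7),
    R factor (8) and cumulative distribution (9) with (10) and (11)."""
    n = len(sums)
    mean = sum(sums) / n
    dev = [abs(a - mean) for a in sums]
    sigma = math.sqrt(sum(d * d for d in dev) / n)
    width = sigma / 10.0                   # channel width
    hist = [0] * n_channels
    for d in dev:
        k = int(d / width)
        if k < n_channels:                 # deviations beyond the last channel are dropped
            hist[k] += 1
    centres = [(i + 0.5) * width for i in range(n_channels)]
    g = [math.sqrt(2.0 / (math.pi * sigma ** 2))
         * math.exp(-c * c / (2.0 * sigma ** 2)) for c in centres]
    scale = sum(hist) / sum(g)             # scale factor K, equation (7)
    r_p = math.sqrt(sum((h - scale * gi) ** 2 for h, gi in zip(hist, g))
                    / sum(h * h for h in hist))
    ends = [(i + 1) * width for i in range(n_channels)]
    cum, running = [], 0
    for h in hist:
        running += h
        cum.append(running / n)            # equation (9)
    n_c = [math.erf(e / (sigma * math.sqrt(2.0))) for e in ends]
    n_a = [1.0 - math.exp(-e * e / sigma ** 2) for e in ends]
    return {"mean": mean, "sigma": sigma, "hist": hist, "R_p": r_p,
            "cum": cum, "N_c": n_c, "N_a": n_a}
```

Calling `analyse(simulate_sums([14.0] + [1.0] * 23))` gives the one-heavy-atom case for the single-cosine sum, and `product_form=True` selects the three-cosine sum; the numbers obtained depend on the generator and seed and need not reproduce Table 1 exactly.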
The results of our simulation example are contained
in  Figs.  1  and  2,  and  some  expected  and  computed 
statistics,  as  well  as  the  R  factors  for  the  histograms 
vs.  ideal  centric  p.d.f.’s  (5)  and  the  cumulative  distri¬ 
butions  (9)  vs.  (10)  and  (11),  are  given  in  Table  1. 
Figs.  1  and  2  also  contain  the  generalized  p.d.f.’s  and 
c.d.f.’s  (dashed  curves),  to  be  given  in  the  next  section, 
and  will  be  referred  to  later. 

Figs. 1(a) and 1(b) show the distributions obtained
for the equal-atom case. The histograms of $P\bar{1}$ [Fig.
1(a)] and $Pmmm$ [Fig. 1(b)] agree well with the
scaled-up Gaussians, and differences due to different
symmetries, or different functional forms of (3) and
(4), appear to be unimportant. This was expected
since each of (3) and (4) satisfies the assumptions of
the Lindeberg-Lévy central limit theorem (Cramer,
1951) and these sums should be approximately
normally distributed, provided the computer-generated
pseudo-random numbers are random enough

Fig. 1. Comparison of simulated histograms with ideal and
non-ideal probability density functions.

The height of each rectangle equals the number of absolute
deviations $\Delta$ which lie within the corresponding channel. The
ideal centric p.d.f.’s (Wilson, 1949) are denoted by solid lines
and the non-ideal p.d.f.’s, evaluated from a $P(\Delta)$ version of
equation (35), are given by the dashed curves. The labels
$n = 2$ etc. denote the number of terms of the expansion (35)
which were used in the construction of a p.d.f. All the p.d.f.’s
are scaled up to the histogram. The variance $\sigma^2$ of the
histogram is used as the distribution parameter in the ideal and
non-ideal p.d.f.’s shown.

(a) $P\bar{1}$, the equal-atom case; (b) $Pmmm$, the equal-atom
case; (c) $P\bar{1}$, the one-heavy-atom case; (d) $Pmmm$, the
one-heavy-atom case.

and are sufficiently independent when generated in a
contiguous sequence (a single run). The extent of
departure from these ideal conditions is rather small,
as can be judged from the values of $\langle A \rangle$ and $\langle A \rangle/\sigma$,
which should, of course, be zero (cf. Table 1), and
the discrepancy between the computed and expected
values of $\sigma$. The latter should equal
$(\tfrac{1}{2}\sum_{j=1}^{24} a_j^2)^{1/2}$ and $(\tfrac{1}{8}\sum_{j=1}^{24} a_j^2)^{1/2}$
for $P\bar{1}$ and $Pmmm$ respectively. The validity
of this simulation procedure is thus supported by
the results shown in Figs. 1(a) and 1(b) and in the
‘equal-atom’ column in Table 1.
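
The expected values of $\sigma$ listed in Table 1 can be checked with a short calculation. The sketch below assumes only that each cosine factor contributes a mean square of 1/2, so that the single-cosine sum carries a factor 1/2 and the three-cosine sum a factor 1/8; the function name `sigma_expected` is hypothetical.

```python
import math


def sigma_expected(weights, mean_square_factor):
    """Expected sigma: (factor * sum of a_j^2)^(1/2)."""
    return math.sqrt(mean_square_factor * sum(a * a for a in weights))


equal_atom = [1.0] * 24                 # all a_j equal to unity
heavy_atom = [14.0] + [1.0] * 23        # a_1 = 14, a_j = 1 otherwise

for label, weights in (("equal-atom", equal_atom), ("heavy-atom", heavy_atom)):
    one_cosine = sigma_expected(weights, 1.0 / 2.0)      # single-cosine sum
    three_cosines = sigma_expected(weights, 1.0 / 8.0)   # three-cosine sum
    print(f"{label}: {one_cosine:.4f} {three_cosines:.4f}")
```

The four numbers agree with the expected values 3.4641, 1.7321, 10.4642 and 5.2321 given in Table 1.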

Large departures from normality are seen in the
results of the simulations for the heavy-atom case
[Figs. 1(c) and 1(d) and Figs. 2(a) and 2(b)]. The
departure is much more serious for $P\bar{1}$ ($R_p^{(c)} = 0.394$)
than it is for $Pmmm$ ($R_p^{(c)} = 0.212$), and the overall
shapes of these distributions are qualitatively

Table 1. Statistics of simulated histograms and
comparison with ideal p.d.f.'s and c.d.f.'s.

                                 equal-atom                heavy-atom
                            $P\bar{1}$    $Pmmm$      $P\bar{1}$    $Pmmm$
$\langle A \rangle$            0.0307     -0.0163       0.0432     -0.0778
$\sigma$                       3.4160      1.7713      10.4454      5.1954
$\sigma_{exp}$                 3.4641      1.7321      10.4642      5.2321
$\langle A \rangle/\sigma$     0.0090     -0.0092       0.0041     -0.0150
$R_p^{(c)}$                    0.063       0.045        0.394       0.212
$R_N^{(c)}$                    0.011       0.005        0.182       0.076
$R_N^{(a)}$                    0.186       0.175        0.079       0.241

The results refer to calculations described in text and
displayed in Figs. 1 and 2. The means $\langle A \rangle$ are obtained as
$\sum_{i=1}^{3000} A_i^{(1)}/3000$ and $\sum_{i=1}^{3000} A_i^{(2)}/3000$ for $P\bar{1}$ and $Pmmm$
respectively, and $\sigma$ is taken as $\langle \Delta^2 \rangle^{1/2}$, as defined in text. For
the definition of $\sigma_{exp}$ (expected), see text. The R factors are:
$R_p^{(c)}$, as defined in (8); $R_N^{(c)}$, simulated cumulative
distribution vs. equation (10); and $R_N^{(a)}$, simulated cumulative
distribution vs. (11).




Fig. 2. Comparison of cumulative distributions from the
simulated histograms with ideal and non-ideal c.d.f.'s.

The circles denote simulated c.d.f.’s obtained from equation
(9), the dashed lines correspond to equation (46) with the number
of terms indicated in the figure, and the solid curves C and A
are the ideal centric and acentric c.d.f.’s respectively. The
latter are obtained by plotting the r.h.s. of (10) and (11) vs.
$|E| = \Delta/\sigma$.

(a) $P\bar{1}$, the one-heavy-atom case; (b) $Pmmm$, the
one-heavy-atom case.

different. Thus, for one heavy atom in $P\bar{1}$, the
frequency (probability) of $\Delta$ is at its maximum for
$\Delta \approx \sigma$, it is appreciably smaller for $\Delta$ near zero and
it goes down to zero for $\Delta > 2.3\sigma$. Since $\Delta$ simulates
$|F|$, the above description has very much in common
with that of the usual distribution of structure
amplitudes for a non-centrosymmetric crystal; in fact, the
cumulative distribution for the $P\bar{1}$ histogram is seen
to be rather close to the ideal acentric c.d.f. [Fig.
2(a), Table 1]. This is consistent with the well-known
observation that in the presence of a heavy atom in
$P\bar{1}$, the distribution ‘looks more like an acentric
one’.

The high probability for weak intensities (small
$|F|$’s or $\Delta$’s), characteristic of the ideal centric
distribution, is more accentuated in the simulated
histogram for $Pmmm$ in the heavy-atom case, at the
expense of frequencies of $\Delta$ in the intermediate region
[Fig. 1(d)]. This distribution is reminiscent of a
hypercentric one and the same impression is given by
the corresponding cumulative distribution, which is
displaced to the hypercentric side of the ideal centric
c.d.f. [Fig. 2(b)].

The important role of space-group symmetry in
the effects of atomic heterogeneity on intensity statistics
has thus been demonstrated for the above two space
groups. Such simulations have also been carried out
for other centrosymmetric and non-centrosymmetric
space groups and a variety of different distributions
was obtained in the heavy-atom case. For the
equal-atom case, all distributions agree well with the
corresponding ideal ones, as in the present example. Several
variations of composition were also tried and these
can be summarized as follows: (1) an increase of the
number of light atoms, for one heavy atom and a
fixed $Z_{heavy}/Z_{light}$ ratio, and a decrease of this ratio,
for a fixed number of atoms in the asymmetric
unit, both result in an improved agreement of the
simulated statistics with the appropriate ideal one,
and (2) the most problematic case appears to be that
of one outstandingly heavy atom among not too
many light ones; the presence of two heavy atoms
in the asymmetric unit leads to a remarkable decrease
in the departure from the ideal statistics and if three
or more equal heavy atoms are present, the ideal
statistics usually indicates the correct space group
quite reliably, even in such an unfavourable space
group as $P\bar{1}$ and for the same heterogeneity as the
one assumed here (i.e. $a_1 = 14$).

Another  important  application  of  such  simulations 
is  their  use  in  testing  theories  that  are  supposed  to 
explain  the  departures  from  the  ideal  statistics  in 
terms  of  atomic  composition  and  space-group  sym¬ 
metry,  i.e.  the  causes  of  such  departures.  The  theory 
given  in  the  next  section  results  in  truncated  series  for 
the  probability  density  functions  and  a  most  pertinent 
practical  question,  namely,  what  is  the  minimum 
number  of  terms  which  are  likely  to  give  a  correct 
representation  of  the  non-ideal  distribution  (or:  how 
many  terms  must  be  evaluated  ?)  is  examined  with  the 
aid  of  such  methods  in  the  last  section. 


3.  Generalized  intensity  statistics 

The  heavy-atom  problem,  illustrated  by  the  above 
simulations,  is  an  example  of  a  situation  which  cannot 
be  satisfactorily  accounted  for  in  terms  of  a  ‘universal’ 
distribution  law,  in  our  case  the  normal  distribution. 
The  departures  from  normality  depend,  as  illustrated 
above,  on  the  composition  of  the  asymmetric  unit  as 
well  as  on  the  space-group  symmetry  of  the  crystal 
and  hence  the  required  non-ideal  or  generalized 
distribution  must  take  these  factors  into  account. 
Clearly,  statistical  as  well  as  crystallographic  con¬ 
siderations  are  needed  for  the  derivation  of  such 
distributions  which  will,  of  course,  differ  for  centro- 
symmetric  and  non-centrosymmetric  space  groups. 

A device of mathematical statistics, which often
permits an easy derivation of generalized distributions,
without the necessity of proceeding from first
principles, is the method of expansions in terms of
orthogonal polynomials (e.g., Cramer, 1951). If the required
distribution is a generalization of a known one and
the additional factors which are introduced lead to
departures from the known distribution, we may try
to expand the required function in terms of the
known one, polynomials in our variable and
coefficients which depend explicitly on the new factors
which require this generalization, i.e. problem-dependent
coefficients. Such a representation of the
required probability density function $P(x)$, which
departs from a given p.d.f. $P^{(0)}(x)$, due to factors not
accounted for by the latter, is given by

$$P(x) = \sum_k c_k f_k(x) P^{(0)}(x), \qquad (12)$$

(Cramer, 1951), where $c_k$ are the problem-dependent
coefficients and $f_k(x)$ are polynomials, associated
with $P^{(0)}(x)$ by the orthogonality relationship

$$\int f_k(x) f_m(x) P^{(0)}(x)\, dx = \delta_{km}. \qquad (13)$$

Such polynomials are known as orthogonal with
respect to the weight function $P^{(0)}(x)$ (e.g., Abramowitz
and Stegun, 1972), and their association with
the weight function is unique. Thus, e.g., if $P^{(0)}(x)$ is
a Gaussian, the associated $f_k$'s are Hermite polynomials
and if $P^{(0)}(x)$ is of the form $\exp(-x)$, it is
associated with Laguerre polynomials (Cramer, 1951;
these and other examples are to be found in Abramowitz
and Stegun, 1972). The above $P^{(0)}$ functions
correspond to the ideal centric and acentric p.d.f.’s
and these examples are therefore relevant to our
applications.

The choice of orthogonal polynomials for such
expansions has several advantages, one of them
being related to the problem-dependent coefficients.
Multiplying both sides of (12) by $f_m(x)$ and
integrating them, assuming that a term-by-term
integration is permissible, we obtain

$$c_m = \int f_m(x) P(x)\, dx, \qquad (14)$$

where use has been made of the orthogonality
relationship (13). The problem-dependent coefficients
$c_k$ in (12) are thus expectation values of the
corresponding orthogonal polynomials and since the latter
are of the form $f_k(x) = \sum_n a_n^{(k)} x^n$, the required
coefficients must be

$$c_k = \sum_n a_n^{(k)} \langle x^n \rangle, \qquad (15)$$

where $a_n^{(k)}$ are the same coefficients which appear in
the polynomials $f_k(x)$, and $\langle x^n \rangle$ are moments of
the distribution with the density function $P(x)$, i.e.
moments of $x$ which contain the required problem
dependence.
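
The statement that the coefficients are moment combinations can be illustrated numerically. The sketch below is an assumption-laden example rather than part of the derivation: it takes $P^{(0)}$ to be the standard normal density on the whole real line, for which the orthonormal polynomials are the probabilists' Hermite polynomials divided by $(k!)^{1/2}$, and the names `hermite_coeffs`, `gaussian_moment` and `expansion_coefficient` are hypothetical.

```python
import math


def hermite_coeffs(k):
    """Power-series coefficients a_n (n = 0..k) of the probabilists'
    Hermite polynomial He_k, from He_{m+1} = x*He_m - m*He_{m-1}."""
    prev, cur = [0.0], [1.0]          # placeholder for He_{-1}, and He_0 = 1
    for m in range(k):
        nxt = [0.0] + cur             # multiply He_m by x
        for n, a in enumerate(prev):
            nxt[n] -= m * a           # subtract m * He_{m-1}
        prev, cur = cur, nxt
    return cur


def gaussian_moment(n):
    """Moment <x^n> of the standard normal density: (n-1)!! for even n."""
    if n % 2:
        return 0.0
    return float(math.prod(range(1, n, 2)))


def expansion_coefficient(k, moments):
    """c_k as a sum of polynomial coefficients times moments <x^n>,
    for the orthonormal polynomial f_k = He_k / sqrt(k!)."""
    a = hermite_coeffs(k)
    return sum(a_n * moments[n] for n, a_n in enumerate(a)) / math.sqrt(math.factorial(k))


ideal = [gaussian_moment(n) for n in range(7)]
print([round(expansion_coefficient(k, ideal), 12) for k in (0, 2, 4, 6)])
```

With the ideal moments only $c_0$ survives, so the expansion reduces to $P^{(0)}$ itself; changing the fourth moment alone changes $c_4$ by the same amount divided by $(4!)^{1/2}$.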

The crystallographic considerations needed for the
evaluation of the required moments are illustrated
by a derivation of the second moment of the
integrated and reduced intensity, $|F|^2$. The derivation
follows closely that given by Wilson (1978). The
structure factor is given by

$$F(\mathbf{h}) = \sum_j f_j J_j(\mathbf{h}), \qquad (16)$$

where $f_j$ is the atomic scattering factor and

$$J_j(\mathbf{h}) = \sum_s \exp[2\pi i\, \mathbf{h}^T (\mathbf{P}_s \mathbf{r}_j + \mathbf{t}_s)] \qquad (17)$$

is the trigonometric structure factor of the $j$th atom,
the summation in (16) extends over all the atoms in
the asymmetric unit and that in (17) ranges over all
the space-group operations $(\mathbf{P}_s, \mathbf{t}_s)$ which transform
the position in which the $j$th atom is located to its
symmetry-equivalent positions. In what follows, only
the set of general positions will be considered and
effects of dispersion will be neglected. (The latter
effects are allowed for in Wilson’s original derivation.)
The second moment of the (corrected) intensity $|F|^2$, or
the fourth moment of $|F|$, can thus be written as

$$\langle |F|^4 \rangle = \langle (FF^*)^2 \rangle = \sum_a \sum_b \sum_c \sum_d f_a f_b f_c f_d \langle J_a J_b^* J_c J_d^* \rangle. \qquad (18)$$

According to the Wilson (1978) statistics of the
trigonometric structure factor, (i) the average
$\langle J_a J_a^* \rangle = \langle |J_a|^2 \rangle$, taken over a large set of
reflections, equals the multiplicity of the Wyckoff
position in which the $a$th atom is located (here, the
order of the point group), which we denote by $p_a$,
(ii) the averages $\langle J_a J_b \rangle$ and $\langle J_a J_b^* \rangle$, with $a \neq b$,
vanish for centrosymmetric and non-centrosymmetric
space groups, and (iii) the average $\langle J_a J_a \rangle$ vanishes
for non-centrosymmetric space groups and equals
$p_a$ for the centrosymmetric ones since $J_a = J_a^*$ in the
latter case and (iii) reduces to (i).

It follows that the non-vanishing terms in (18)
must contain even moments of $|J_a|$ and thus

$$\langle |F|^4 \rangle = L \sum_{a \neq b} p_a p_b f_a^2 f_b^2 + \sum_a q_a f_a^4, \qquad (19)$$

where $q_a = \langle |J_a|^4 \rangle$. Following condition (iii) of the
$J$ statistics, it can be readily verified that the
multiplicity $L$ of the double summation on the r.h.s. of (19)
equals 3 or 2 according as the space group is
centrosymmetric or non-centrosymmetric respectively.
Making use of the identity

$$\sum_{a \neq b} f_a^2 f_b^2 = \Big(\sum_a f_a^2\Big)^2 - \sum_a f_a^4 \qquad (20)$$

and dropping the subscripts from the moments of $|J|$
(this is possible since all the atoms occupy general
positions) we obtain

$$\langle |F|^4 \rangle = L p^2 \Big(\sum_a f_a^2\Big)^2 + (q - L p^2) \sum_a f_a^4 \qquad (21)$$

(Wilson, 1978).

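The fourth moment of the normalized structure
amplitude $|E| = |F|/\langle |F|^2 \rangle^{1/2}$ is obtained upon
dividing both sides of (21) by

$$\langle |F|^2 \rangle^2 = \Big(\sum_a p_a f_a^2\Big)^2 = p^2 \Big(\sum_a f_a^2\Big)^2. \qquad (22)$$

It follows that

$$\langle |E|^4 \rangle = L + (\gamma_4 - L) Q_4, \qquad (23)$$

where

$$\gamma_4 = \langle |J|^4 \rangle / \langle |J|^2 \rangle^2$$

and

$$Q_4 = \sum_a f_a^4 \Big/ \Big(\sum_a f_a^2\Big)^2.$$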
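
As a worked illustration of the fourth-moment formula, the sketch below evaluates it for the composition used in the simulations (one atom of weight 14 among 23 of weight 1, the weights standing in for scattering factors). The values of $\gamma_4$ are assumptions of this example: 3/2 for $P\bar{1}$, and 27/8 for $Pmmm$ under the assumption that $J$ is a product of three cosines; the function names are hypothetical.

```python
def q_composition(weights, k):
    """Composition term: sum of f^(2k) over (sum of f^2)^k."""
    return sum(f ** (2 * k) for f in weights) / sum(f * f for f in weights) ** k


def fourth_moment_of_E(weights, gamma4, centrosymmetric=True):
    """<|E|^4> = L + (gamma_4 - L) * Q_4, with L = 3 (centric) or 2 (acentric)."""
    L = 3 if centrosymmetric else 2
    return L + (gamma4 - L) * q_composition(weights, 2)


equal_atom = [1.0] * 24
heavy_atom = [14.0] + [1.0] * 23
GAMMA4_P1BAR = 1.5      # assumed: 3!!/2! for J = 2 cos(alpha)
GAMMA4_PMMM = 3.375     # assumed: (3/2)^3 for a product of three cosines

for label, weights in (("equal-atom", equal_atom), ("heavy-atom", heavy_atom)):
    print(label,
          round(fourth_moment_of_E(weights, GAMMA4_P1BAR), 4),
          round(fourth_moment_of_E(weights, GAMMA4_PMMM), 4))
```

In the heavy-atom case the $P\bar{1}$ value falls to about 1.8, close to the ideal acentric value of 2, whereas the $Pmmm$ value rises to about 3.3, above the ideal centric value of 3.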

Higher moments of $|E|$ are derived along similar
lines. A unified derivation of $\langle |E|^4 \rangle$, $\langle |E|^6 \rangle$ and
$\langle |E|^8 \rangle$ was presented by Shmueli and Wilson
(1981) and an extension to $\langle |E|^{10} \rangle$ was carried out
by Shmueli (1982). Since for the $2n$th moment a
$2n$-fold summation, analogous to (18), must be
considered, the algebraic complexity of the derivation
increases very quickly with increasing order of
the moment. However, this effort is justified by the
fact that each additional moment permits the addition of a
(calculable) term to the generalized p.d.f.’s, based on
(12), and thus helps to overcome possible problems related
to the convergence of the latter to the observed (or
simulated) p.d.f. (see below).

The space-group dependence of the moments is
contained in the even moments of the trigonometric
structure factor [cf. eqs. (23) and (17)]. Since the real
and imaginary parts of $J$ are listed for all the space
groups and all the $hkl$ subsets leading to different
functional forms of $J$ (International Tables, 1952),
these moments can be evaluated by a straightforward
averaging of the resulting trigonometric polynomials
and their powers. Such a calculation of $\langle |J|^4 \rangle$ was
done by Wilson (1978) for all the space groups but two
($Fd3m$ and $Fd3c$, $p = 192$) and thus made possible
the evaluation of the fourth moment of $|E|$. However,
a straightforward calculation of moments of $|J|$
higher than the fourth appeared to be too cumbersome
and a computer algorithm was constructed which
yielded the values of $\langle |J|^4 \rangle$ and $\langle |J|^6 \rangle$ for all the
centred tetragonal (with $p = 32$) and cubic space
groups, starting from the usual trigonometric
expressions (Shmueli & Kaldor, 1981). The eighth moment
of $|J|$ can also be computed for those space groups
with the above algorithm. The remaining space groups
were dealt with using an algorithm to be described
below, taking $\langle |J|^4 \rangle$ as an example. We have, using
equation (17),

$$\langle |J|^4 \rangle = \langle (JJ^*)^2 \rangle = \sum_s \sum_t \sum_u \sum_v \langle \exp[i(\phi_{stuv} + \theta_{stuv})] \rangle, \qquad (24)$$

where

$$\phi_{stuv} = 2\pi \mathbf{h}^T (\mathbf{P}_s - \mathbf{P}_t + \mathbf{P}_u - \mathbf{P}_v) \mathbf{r} \qquad (25)$$

and

$$\theta_{stuv} = 2\pi \mathbf{h}^T (\mathbf{t}_s - \mathbf{t}_t + \mathbf{t}_u - \mathbf{t}_v) \qquad (26)$$

(Shmueli and Kaldor, 1981).

As shown by these authors, if $\mathbf{r}$ is a general position
then the condition to be fulfilled by a non-vanishing
term in (24) is that $\phi_{stuv}$ be identically zero and the
value of such a term is given by $\exp(i\theta_{stuv})$. This can
be so only if

$$\mathbf{P}_s - \mathbf{P}_t + \mathbf{P}_u - \mathbf{P}_v = 0 \qquad (27)$$

and a two-step algorithm follows: (i) find $stuv$ for
which (27) holds true and (ii) accumulate the
corresponding value of $\exp(i\theta_{stuv})$. The extension of the
above to any even order is quite straightforward but
this algorithm is less efficient than the former one
when dealing with space groups for which $p$ exceeds
24 (Shmueli & Kaldor, 1981).
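
The two-step algorithm can be sketched as a brute-force loop over all quadruples of operations. This is a minimal illustration and not the program of Shmueli & Kaldor (1981): the representation of an operation as a pair of a 3x3 integer matrix and a translation, the function names and the sample reflection `h` are all assumptions of the example.

```python
import cmath
import itertools
import math


def diag(a, b, c):
    """3x3 diagonal matrix as nested tuples."""
    return ((a, 0, 0), (0, b, 0), (0, 0, c))


def fourth_moment_J(ops, h=(1, 2, 3)):
    """Sum exp(i*theta_stuv) over all quadruples (s, t, u, v) for which
    P_s - P_t + P_u - P_v vanishes; ops is a list of (P, t) pairs."""
    idx = range(3)
    total = 0j
    for (Ps, ts), (Pt, tt), (Pu, tu), (Pv, tv) in itertools.product(ops, repeat=4):
        # step (i): keep only quadruples whose matrix combination is zero
        if any(Ps[i][j] - Pt[i][j] + Pu[i][j] - Pv[i][j] for i in idx for j in idx):
            continue
        # step (ii): accumulate the translation phase factor
        theta = 2.0 * math.pi * sum(h[i] * (ts[i] - tt[i] + tu[i] - tv[i]) for i in idx)
        total += cmath.exp(1j * theta)
    return total.real


ZERO = (0, 0, 0)
P1BAR = [(diag(1, 1, 1), ZERO), (diag(-1, -1, -1), ZERO)]
PMMM = [(diag(a, b, c), ZERO) for a, b, c in itertools.product((1, -1), repeat=3)]

print(fourth_moment_J(P1BAR), fourth_moment_J(PMMM))
```

For these two symmorphic examples all translations are zero, so the result is simply the number of admissible quadruples and does not depend on `h`; with non-zero translations the sum would have to be evaluated separately for each class of reflections. The cost grows as $p^4$, which is in line with the remark on efficiency for large $p$.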

Equation (27) and its higher analogs permit an
easy evaluation of the lower limits of $\langle |J|^{2n} \rangle$ for
all the space groups. Thus, (27) must hold true if
(i) $s = t = u = v$ (there are $p$ such terms) and
(ii) $s = t \neq u = v$ or $s = v \neq t = u$ [$2p(p-1)$
terms]. Equation (24) thus contains at least
$p + 2p(p-1) = 2p^2 - p$ non-vanishing terms and the
lower limit of $\langle |J|^4 \rangle$ is just this number. Defining

$$\gamma_{2n} = \langle |J|^{2n} \rangle / \langle |J|^2 \rangle^n \qquad (28)$$

and making use of the fact that $\langle |J|^2 \rangle$ always
equals $p$, the order of the point group times the lattice
multiplicity, we obtain, from similar considerations,
the following lower limits:

$$\gamma_4 \geq 2 - \frac{1}{p}, \qquad (29)$$

$$\gamma_6 \geq 6 - \frac{9}{p} + \frac{4}{p^2}, \qquad (30)$$

$$\gamma_8 \geq 24 - \frac{72}{p} + \frac{82}{p^2} - \frac{33}{p^3}, \qquad (31)$$

$$\gamma_{10} \geq 120 - \frac{600}{p} + \frac{1250}{p^2} - \frac{1225}{p^3} + \frac{456}{p^4}. \qquad (32)$$

It is interesting that the above inequalities become
equalities for space groups with $p < 4$ as well as for
some others. This empirical observation is consistent
with a similar remark made by Wilson (1978) in
connection with the relation between the value of
$\langle |J|^4 \rangle$ and the order of the point group.
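
The equality for small $p$ can be checked in exact arithmetic for $P\bar{1}$ ($p = 2$), where $J$ is twice a single cosine and the reduced moments are $(2n-1)!!/n!$. The coefficient lists in the sketch below are an assumption of this example and should be compared with the printed inequalities; the names `LOWER_LIMIT_COEFFS`, `lower_limit` and `gamma_p1bar` are hypothetical.

```python
import math
from fractions import Fraction

# Assumed coefficients of 1, 1/p, 1/p^2, ... in the lower limits of
# gamma_4, gamma_6, gamma_8 and gamma_10 (keyed by n, for the 2n-th moment).
LOWER_LIMIT_COEFFS = {
    2: (2, -1),
    3: (6, -9, 4),
    4: (24, -72, 82, -33),
    5: (120, -600, 1250, -1225, 456),
}


def lower_limit(n, p):
    """Lower limit of gamma_2n for a space group with <|J|^2> = p."""
    return sum(Fraction(c, p ** i) for i, c in enumerate(LOWER_LIMIT_COEFFS[n]))


def gamma_p1bar(n):
    """Exact reduced moment (2n-1)!!/n! for a single cosine, p = 2."""
    odd_double_factorial = math.prod(range(1, 2 * n, 2))
    return Fraction(odd_double_factorial, math.factorial(n))


for n in range(2, 6):
    print(2 * n, lower_limit(n, 2), gamma_p1bar(n))
```

Each lower limit coincides with the exact value for $p = 2$, and all four reduce to unity for $p = 1$, as they must when $|J| = 1$.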

It should be noted that the trigonometric structure
factors for triclinic, monoclinic and orthorhombic
space groups (except $Fdd2$ and $Fddd$) are simple
enough to yield closed expressions for $\gamma_{2k}$ by direct
averaging, e.g., for $P\bar{1}$ we have

$$J = 2\cos 2\pi(hx + ky + lz) \equiv 2\cos\alpha \qquad (33)$$

and

$$\langle |J|^{2n} \rangle = 2^{2n} \langle \cos^{2n}\alpha \rangle = 2^{2n}\, \frac{(2n-1)!!}{(2n)!!}$$

(e.g., Abramowitz and Stegun, 1972). Since
$p = \langle |J|^2 \rangle = 2$ for $P\bar{1}$, we have

$$\gamma_{2n} = \frac{(2n-1)!!}{n!}. \qquad (34)$$

A list of such expressions for the low-symmetry space
groups has been given by Shmueli (1982).

We now present the equations of generalized
intensity statistics which account for an arbitrary atomic
heterogeneity and any space-group symmetry; it is
assumed that (i) all the atoms are located in general
positions, (ii) there is no pseudosymmetry or other
dependence in the structure, and (iii) effects of
dispersion are negligible. The probability density
functions, to be given below, were constructed from
moments of $|E|$, derived as outlined above (Wilson,
1978; Shmueli & Wilson, 1981; Shmueli, 1982),
with the aid of the expansion method described at the
beginning of this section.

The probability density functions of $|E|$ are given
by

$$P_c(|E|) = P_c^{(0)}(|E|) \left[ 1 + \sum_{k=2}^{n} \frac{A_{2k}}{2^k (2k)!} H_{2k}\!\left(\frac{|E|}{\sqrt{2}}\right) \right] \qquad (35)$$

and

$$P_a(|E|) = P_a^{(0)}(|E|) \left[ 1 + \sum_{k=2}^{n} \frac{(-1)^k B_{2k}}{k!} L_k(|E|^2) \right] \qquad (36)$$

for centrosymmetric and non-centrosymmetric space
groups respectively, where

$$P_c^{(0)}(|E|) = \left(\frac{2}{\pi}\right)^{1/2} \exp\left(-\frac{|E|^2}{2}\right) \quad \text{and} \quad P_a^{(0)}(|E|) = 2|E| \exp(-|E|^2) \qquad (37)$$

are the ideal centric and acentric p.d.f.’s of $|E|$, based
on the Wilson (1949) statistics, respectively, the
coefficients $A_{2k}$ and $B_{2k}$ depend via the even moments
of $|E|$ on the composition of the asymmetric unit and
space-group symmetry of the crystal and $H_{2k}$ and
$L_k$ are Hermite and Laguerre polynomials as defined
and tabulated by Abramowitz and Stegun (1972).

The currently available expressions for the
coefficients are

$$A_4 \text{ or } B_4 = \alpha_4 Q_4, \qquad (38)$$

$$A_6 \text{ or } B_6 = \alpha_6 Q_6, \qquad (39)$$

$$A_8 \text{ or } B_8 = \alpha_8 Q_8 + U(\alpha_4^2 Q_4^2 - \gamma_4^2 Q_8), \qquad (40)$$

$$A_{10} \text{ or } B_{10} = \alpha_{10} Q_{10} + V \gamma_4^2 Q_{10} + W(\alpha_4 \alpha_6 Q_4 Q_6 - \gamma_4 \gamma_6 Q_{10}) \qquad (41)$$

with

$$\alpha_{2k} = \sum_{p=2}^{k} (-1)^{k-p} (k-p)!\, \alpha_{kp} \gamma_{2p} + (-1)^{k-1} (k-1)!\, \alpha_{k0}, \qquad (42)$$

where

$$\alpha_{kp} = \binom{k}{p} \frac{(2k-1)!!}{(2p-1)!!} \quad \text{or} \quad \binom{k}{p} \frac{k!}{p!}, \qquad (43)$$

with $(2k-1)!! = (2k)!/(2^k k!)$, according as the
space group is centrosymmetric or non-centrosymmetric,
and

$$Q_{2k} = \sum_{j=1}^{m} f_j^{2k} \Big/ \Big(\sum_{j=1}^{m} f_j^2\Big)^k, \qquad (44)$$

where $m$ is the number of atoms in the asymmetric
unit and $f_j$ are their scattering factors. The
space-group constants $\gamma_{2p}$ are defined as in (28) and the
constants $U$, $V$ and $W$ appearing in (40) and (41) are
given by 35, 3150 and 210 for centrosymmetric ($A_{2k}$)
and by 18, 900 and 100 for non-centrosymmetric ($B_{2k}$)
space groups respectively. The first five terms of (35)
and (36) can thus be evaluated.
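
The coefficient formulae lend themselves to a direct check in exact arithmetic. The sketch below is illustrative only: it assumes the centric and acentric forms $\binom{k}{p}(2k-1)!!/(2p-1)!!$ and $\binom{k}{p}k!/p!$ for the numerical factors, takes the reduced moments of $P\bar{1}$ as $(2n-1)!!/n!$ and those of $Pmmm$ as their cubes (an assumption valid if $J$ is a product of three cosines), and uses hypothetical function names throughout.

```python
import math
from fractions import Fraction


def odd_double_factorial(k):
    """(2k-1)!!, equal to 1 for k = 0."""
    return math.prod(range(1, 2 * k, 2))


def alpha_kp(k, p, centric=True):
    """Numerical factor alpha_kp (centric or acentric form, as assumed above)."""
    if centric:
        return Fraction(math.comb(k, p) * odd_double_factorial(k), odd_double_factorial(p))
    return Fraction(math.comb(k, p) * math.factorial(k), math.factorial(p))


def alpha_2k(k, gamma, centric=True):
    """Symmetry term alpha_2k built from the space-group constants gamma[2p]."""
    total = sum((-1) ** (k - p) * math.factorial(k - p) * alpha_kp(k, p, centric) * gamma[2 * p]
                for p in range(2, k + 1))
    return total + (-1) ** (k - 1) * math.factorial(k - 1) * alpha_kp(k, 0, centric)


def coefficients(gamma, Q, centric=True):
    """A_2k (or B_2k) for 2k = 4, 6, 8, 10 from gamma[2k] and Q[2k]."""
    U, V, W = (35, 3150, 210) if centric else (18, 900, 100)
    al = {2 * k: alpha_2k(k, gamma, centric) for k in range(2, 6)}
    return {
        4: al[4] * Q[4],
        6: al[6] * Q[6],
        8: al[8] * Q[8] + U * (al[4] ** 2 * Q[4] ** 2 - gamma[4] ** 2 * Q[8]),
        10: (al[10] * Q[10] + V * gamma[4] ** 2 * Q[10]
             + W * (al[4] * al[6] * Q[4] * Q[6] - gamma[4] * gamma[6] * Q[10])),
    }


def even_moment(k, coeff, centric=True):
    """<|E|^2k> recovered from the coefficients A_2p (or B_2p)."""
    return alpha_kp(k, 0, centric) + sum(alpha_kp(k, p, centric) * coeff[2 * p]
                                         for p in range(2, k + 1))


def gamma_p1bar():
    return {2 * n: Fraction(odd_double_factorial(n), math.factorial(n)) for n in range(2, 6)}


def gamma_pmmm():
    return {key: value ** 3 for key, value in gamma_p1bar().items()}


print(alpha_2k(2, gamma_p1bar()), alpha_2k(3, gamma_p1bar()))
print(alpha_2k(2, gamma_pmmm()), alpha_2k(3, gamma_pmmm()))
```

The printed symmetry terms are -3/2 and 10 for $P\bar{1}$ and 3/8 and -5 for $Pmmm$. As a further consistency check, for a single atom in $P\bar{1}$ (all composition terms equal to unity) the even moments rebuilt from the coefficients reproduce the exact moments of $\sqrt{2}\cos\alpha$ up to the tenth.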

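The even moments of $|E|$ are related to the
coefficients $A_{2k}$ and $B_{2k}$, as defined by (38)-(41), by

$$\langle |E|^{2k} \rangle = \alpha_{k0} + \sum_{p=2}^{k} \alpha_{kp} A_{2p} \ (\text{or } B_{2p}), \qquad (45)$$

where $\alpha_{kp}$ is defined by (43), and the cumulative
distributions of $|E|$ are obtained, by direct integration
of (35) and (36), as

$$N_c(|E|) = \mathrm{erf}\!\left(\frac{|E|}{\sqrt{2}}\right) - \frac{2}{\sqrt{\pi}} \exp\!\left(-\frac{|E|^2}{2}\right) \sum_{k=2}^{n} \frac{A_{2k}}{2^k (2k)!} H_{2k-1}\!\left(\frac{|E|}{\sqrt{2}}\right) \qquad (46)$$

and

$$N_a(|E|) = 1 - \exp(-|E|^2) + \exp(-|E|^2) \sum_{k=2}^{n} (-1)^k B_{2k} \left[L_{k-1}(|E|^2) - L_k(|E|^2)\right]/k!, \qquad (47)$$

for centrosymmetric and non-centrosymmetric space
groups respectively (Shmueli, 1982).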
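
A numerical sketch of the centric expansion may help in checking a reconstruction of these formulae. The code below is an illustration under assumptions: it uses the physicists' Hermite polynomials with argument $|E|/\sqrt{2}$ and the weight $A_{2k}/[2^k (2k)!]$ for each term, obtains the cumulative form from the identity $\int e^{-x^2} H_n(x)\,dx = -e^{-x^2} H_{n-1}(x)$, and takes arbitrary trial coefficients; all function names are hypothetical.

```python
import math


def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) by the three-term recurrence."""
    if n == 0:
        return 1.0
    h_prev, h = 1.0, 2.0 * x
    for m in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * m * h_prev
    return h


def pdf_centric(e, coeff):
    """Ideal centric p.d.f. times the truncated correction series;
    coeff maps 2k to A_2k, e.g. {4: A4, 6: A6}."""
    ideal = math.sqrt(2.0 / math.pi) * math.exp(-0.5 * e * e)
    x = e / math.sqrt(2.0)
    series = sum(a / (2 ** (two_k // 2) * math.factorial(two_k)) * hermite(two_k, x)
                 for two_k, a in coeff.items())
    return ideal * (1.0 + series)


def cdf_centric(e, coeff):
    """Term-by-term integral of pdf_centric from 0 to e."""
    x = e / math.sqrt(2.0)
    series = sum(a / (2 ** (two_k // 2) * math.factorial(two_k)) * hermite(two_k - 1, x)
                 for two_k, a in coeff.items())
    return math.erf(x) - 2.0 / math.sqrt(math.pi) * math.exp(-0.5 * e * e) * series


def trapezoid(func, upper, steps=20000):
    """Plain trapezoidal rule on [0, upper]."""
    step = upper / steps
    total = 0.5 * (func(0.0) + func(upper))
    total += sum(func(i * step) for i in range(1, steps))
    return total * step


trial = {4: -1.2, 6: 8.0}      # arbitrary trial values of A_4 and A_6
norm = trapezoid(lambda e: pdf_centric(e, trial), 12.0)
fourth = trapezoid(lambda e: e ** 4 * pdf_centric(e, trial), 12.0)
print(norm, fourth, cdf_centric(1.0, trial))
```

With these conventions the truncated density stays normalized, its fourth moment equals 3 plus the trial value of $A_4$ regardless of $A_6$, and the closed cumulative form agrees with direct numerical integration of the density.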


4.  Discussion 

The  probability  density  and  cumulative  distribution 
functions,  given  above,  are  formally  convergent 
expansions  (Shmueli  and  Wilson,  1981)  but  they  are 
available  for  use  as  truncated  series,  so  far  containing 
at  most  their  first  five  terms.  From  a  practical  point 
of view, it is thus important to know whether the
available  terms  satisfactorily  represent  the  actual 
distributions,  under  circumstances  which  may  require 
the  use  of  non-ideal  statistics.  It  is  also  of  interest  to 
examine  the  rate  of  convergence,  as  a  function  of 
atomic  composition  of  the  asymmetric  unit  and 
space-group  symmetry  of  the  crystal,  as  this  may 
indicate  how  many  terms  must  be  included  in  order 
to  achieve  a  correct  representation. 

Before  proceeding  with  the  latter  question,  let  us 
return  to  the  simulation  exercise  described  above  and 
examine  the  comparisons  of  equations  (35)  and  (46), 
as  they  stand,  with  the  simulated  histograms  and  their 
cumulative distributions respectively [Figs. 1 and 2].
In  each  case  four  curves,  corresponding  to  two-, 
three-,  four-  and  five-term  expansions,  are  plotted 
along  with  the  histogram  and  the  ideal  distribution. 

It is seen from Figs. 1(a) and 1(b) that in the equal-atom
case the different truncated expansions nearly
coalesce  and  are  remarkably  close  to  the  ideal  centric 
p.d.f.’s,  for  both  space  groups.  This  was  expected 
since  the  A2k  coefficients  tend  to  zero  as  the  number 
of  equal  atoms  tends  to  infinity,  and  the  small  differ¬ 
ences  between  the  non-ideal  and  ideal  p.d.f.’s  reflect 
the  effects  of  space-group  symmetry  in  the  presence 
of  an  asymmetric  unit  of  finite  size. 

The heavy-atom histogram for $P\bar{1}$, and the
corresponding c.d.f. [Figs. 1(c) and 2(a)], are satisfactorily
explained by four-term and five-term expansions.
The three-term series is the shortest to account
semi-quantitatively for the pseudo-acentric hump in the
histogram and the first two terms are clearly
insufficient for this purpose. Convergence of the $P\bar{1}$
expansions appears to be very slow up to the fourth
term but improves at the fifth one. Whether or not
the convergence continues to improve with an
additional term is not yet clear but this seems to be
a worthwhile check, even if six-term expansions will
never be used in practical applications.

Comparisons of the various expansions with the
somewhat ‘hypercentric’ $Pmmm$ histogram and its
c.d.f. are given in Figs. 1(d) and 2(b). As above, four-
and five-term expansions explain the simulated
histogram quite well, the three-term expansion is the
shortest to account for the excess of very small and
very large values of $\Delta$, albeit very approximately,
and the two-term expansion appears to be too short
for this level of heterogeneity.

The different shapes of the non-ideal p.d.f.’s for
$P\bar{1}$ and $Pmmm$ are due to the different signs of the
symmetry terms $\alpha_{2k}$, as well as their different
magnitudes. Thus, e.g., $\alpha_4 = -1.5$ and $0.375$, $\alpha_6 = 10$
and $-5$ for $P\bar{1}$ and $Pmmm$ respectively and similar
relationships hold for the higher-order symmetry terms. It
therefore follows that $Pmmm$ expansions are affected
by the heavy atom to a lesser extent than those for
$P\bar{1}$, in accordance with the histogram and analogous
distributions recalculated from solved structures
(Shmueli, 1981b). However, the rate of convergence
is similar for the two space groups and is therefore
dependent mainly on the atomic composition of the
asymmetric unit.

Let us examine the composition-dependent term
$Q_{2k}$ for an asymmetric unit containing $l$ carbons and
$r$ equal atoms of type X. We have from (44)

$$Q_{2k} = \frac{l f_c^{2k} + r f_x^{2k}}{(l f_c^2 + r f_x^2)^k} = \frac{l + r\rho^{2k}}{(l + r\rho^2)^k}, \qquad (48)$$

where $\rho = f_x/f_c$. For the present considerations we
may replace the scattering factors by atomic numbers,
i.e., $\rho \approx Z_x/Z_c$, but this is not recommended or
needed in applications to real problems. It follows
from (48) that for the equal-atom case ($f_x = f_c$ or
$\rho = 1$) the composition term is


$$Q_{2k} = \frac{1}{(l + r)^{k-1}}, \qquad (49)$$

while for extreme heterogeneity ($\rho = f_x/f_c \gg 1$) and not too
many light atoms ($r\rho^2 \gg l$), we have

$$Q_{2k} \approx \frac{1}{r^{k-1}}. \qquad (50)$$
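
The two limiting forms of the composition term can be compared with its exact value for the compositions discussed in this paper. The sketch below is a simple illustration; the function name `q_2k` and the choice of a scattering-factor ratio of 14 (taken from the simulation example) are assumptions of the example.

```python
def q_2k(l, r, rho, k):
    """Composition term for l light atoms and r heavy atoms whose
    scattering factor is rho times that of a light atom."""
    return (l + r * rho ** (2 * k)) / (l + r * rho ** 2) ** k


# Equal-atom case: 24 atoms, rho = 1, so Q_2k = 1/24^(k-1).
print([q_2k(23, 1, 1.0, k) for k in (2, 3, 4)])

# One heavy atom (rho = 14) among 23 light ones: Q_2k stays of order unity.
print([round(q_2k(23, 1, 14.0, k), 4) for k in (2, 3, 4)])

# Two heavy atoms among 22 light ones: Q_2k is roughly 1/2^(k-1).
print([round(q_2k(22, 2, 14.0, k), 4) for k in (2, 3, 4)])
```

For one heavy atom the composition terms decrease only slowly with $k$, whereas for two heavy atoms they already fall off roughly as inverse powers of 2.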


Let us now rewrite the expansion (35) in a symbolic
form, in terms of the composition-dependent
quantities. Making use of equations (38)-(41), we can write

$$P_c(|E|) = P_c^{(0)}(|E|)\, S_{GC}, \qquad (51)$$

with

$$S_{GC} = 1 + d_4 Q_4 + d_6 Q_6 + (d_8 Q_8 + d'_8 Q_4^2) + (d_{10} Q_{10} + d''_{10} Q_4 Q_6) + \ldots,$$

where the multipliers $d_{2k}$, $d'_{2k}$ and $d''_{2k}$ contain Hermite
polynomials of order $2k$, space-group constants and
numerical coefficients. In the above series, as in
equations (35) and (36), the terms are arranged
according to the increasing order of the orthogonal
polynomials and these series are thus examples of the
Gram-Charlier expansion (Cramer, 1951), abbreviated
here as GC. A possible disadvantage of the GC
arrangement is that the terms are not necessarily
arranged according to a decreasing order of magnitude
(Cramer, 1951), and when this is so, another form of
the expansion, in which terms of the same order
are grouped together, is strongly recommended by
Cramer. The latter is known as the Edgeworth
arrangement (Cramer, 1951), abbreviated here as
ED.

As  can  be  seen  from  (49)  and  (50),  Q6  01  and 
08  ~  040e>  and  the  expansion  (51)  can  be  rewritten 
for  the  two  extreme  compositions  as 


SED 


Ck  +  d  's  _l 

- 5—  + 

/* 


d’  +  d" 


10 


+ 


(52) 


where $\mu$ stands for l + r, the number of atoms in the asymmetric unit in the equal-atom case, or for r, the number of heavy atoms present. This is an Edgeworth-type arrangement, but the fifth term in (52) is incomplete since there are more terms of the order of $\mu^{-4}$ in the higher, so far unavailable, terms of the expansions (35) and (36). When $\mu$ is not too small, this ED arrangement is clearly an asymptotic expansion and is likely to converge quickly; e.g., the fast convergence of the correction term in the equal-atom case ($\mu$ = 24) to values differing only slightly from unity [cf. Figs. 1(a) and 1(b)] is due to the inverse-power-series character of the expansion. It was also seen in simulations, not described here, that the first four terms of $S_{ED}$ give rise to a somewhat better agreement with the histogram than that given by a five-term GC expansion, provided at least two heavy atoms are present. In fact, $\mu$ = 2 in equation (52) already promises some convergence (cf. also Shmueli and Wilson, 1981). However, in the important case of one very heavy atom among not too many light ones, all the composition terms are of the order of unity or decrease very slowly with increasing order; the ED arrangement then loses its a priori asymptotic character, and the rate of convergence depends mainly on the $d_{2k}$, $d'_{2k}$ and $d''_{2k}$ coefficients in (51) and (52).
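The contrast between the two extreme compositions can be illustrated numerically. The following is a minimal sketch, assuming that the composition terms have the form $Q_{2k} = \sum_j f_j^{2k} / (\sum_j f_j^2)^k$ (the definition quoted as equation (36) in the following paper of this volume); the function name `q_term` and the scattering-factor values (atomic numbers used as stand-ins) are hypothetical and serve only as an illustration.

```python
def q_term(f, k):
    """Composition term Q_2k = sum_j f_j^(2k) / (sum_j f_j^2)^k (assumed definition)."""
    return sum(fj ** (2 * k) for fj in f) / sum(fj ** 2 for fj in f) ** k


# Hypothetical scattering factors: atomic numbers used as stand-ins.
equal_atoms = [6.0] * 24           # equal-atom case, mu = l + r = 24
one_heavy = [78.0] + [6.0] * 14    # one heavy atom among 14 light ones, mu = r = 1

for k in range(2, 6):
    # Equal atoms: Q_2k equals mu**(1 - k) and falls off rapidly with k.
    # One heavy atom: Q_2k stays of the order of unity for every k.
    print(2 * k, q_term(equal_atoms, k), 24.0 ** (1 - k), q_term(one_heavy, k))
```

In the equal-atom case the printed terms follow the inverse powers of $\mu$, whereas in the one-heavy-atom case they decrease very slowly, which is the situation in which the Edgeworth grouping loses its a priori ordering by magnitude.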

Rather  extensive  numerical  tests  and  simulations 
were  carried  out  in  order  to  examine  the  applicability 
of  the  two  arrangements  to  a  description  of  non-ideal 
probability  densities  and  cumulative  distributions.  A 
detailed  report  of  the  results  is  beyond  the  scope  of 
this  paper  and  only  the  main  conclusions  are 
presented. 

1. In the equal-atom case, both ED and GC arrangements are very close to the corresponding ideal p.d.f.'s (Wilson, 1949) and the effects of space-group symmetry are very small.

2. In the case of $C_l X_r$ asymmetric units with p > 1, the histograms are well accounted for by ED and GC arrangements, provided $r \ge 2$. Occasionally, the ED arrangement is better than the GC one but the advantage seems to be marginal in the cases studied.

3. In the one-heavy-atom case, all the so far simulated histograms could be explained by three-term, four-term or five-term GC expansions. The performance of ED arrangements is somewhat worse in some space groups and quite unacceptable in others, e.g., in $P1$ and $P\bar{1}$.

Hence, in spite of the preferable mathematical properties of the Edgeworth arrangement, it was rejected in favour of the Gram-Charlier arrangement insofar as applications to the heavy-atom problem in intensity statistics are concerned. The inapplicability of the ED form to $P\bar{1}$ is illustrated in Fig. 3, in which we compare N(|E|) distributions based on (i) ideal centric and acentric p.d.f.'s, (ii) five-term GC arrangements for $P1$ and $P\bar{1}$ and (iii) four-term ED arrangements for these space groups with the cumulative distribution of |E|, as recalculated from the solved $P\bar{1}$ structure of C6N4O4Cl2Pt (Faggiani, Lippert & Lock, 1980; Shmueli, 1982)*. It should be pointed out that the small departure of the c.d.f.'s for $P1$ from the ideal acentric c.d.f. is due to the fact that the assumed asymmetric unit for $P1$ is exactly twice the size of the unit of $P\bar{1}$, and contains two platinum atoms rather than the single heavy atom present in $P\bar{1}$. The above picture may change when several additional terms are included in the expansions (35) and (36).

The equations of generalized intensity statistics, given at the end of the previous section, are presented in the Gram-Charlier arrangement and form the basis of the author's computer program, which is described elsewhere (Shmueli, 1982) and is being further developed.

*On the other hand, a good agreement of the recalculated N(|E|) with that based on a five-term Gram-Charlier expansion for $P\bar{1}$ is evident.
Fig. 3. Cumulative distribution functions for a one-heavy-atom structure.

N(|E|) values, recalculated from the published structure of C6N4O4Cl2Pt (Faggiani, Lippert & Lock, 1980; Shmueli, 1982), are compared with non-ideal distributions based on the above composition. The correct space group is $P\bar{1}$.

(a) N(|E|) values recalculated from the structure, (b) five-term GC expansion for PI, (c) five-term GC expansion for PI, (d) four-term ED expansion for PI, (e) four-term ED expansion for PI, (f) ideal centric N(|E|) and (g) ideal acentric N(|E|).

The  author  wishes  to  thank  Professor  A.  J.  C.  Wilson 
for  his  comments  on  the  simulation  approach  des¬ 
cribed  above. 

The computations were carried out at the Tel-Aviv University Computation Center, with CDC6600 and/or CYBER 172 computers and a NOS/BE operating system.
Abstract

Non-ideal  structure-factor  statistics  are  rederived  in  a 
general  manner  and  their  implementation  in  practical 
procedures  is  indicated  and  discussed.  The  derivation 
employs  standard  cumulant-moment  relationships 
for  a  real  random  variable,  as  well  as  some  known 
properties  of  the  cumulants.  These  properties  render 
the  derivation  very  simple  and,  at  the  same  time,  quite 
general.  The  previously  assumed  vanishing  of  the  odd 
moments  of  the  trigonometric  structure  factor  is 
supported  by  the  result  of  a  computation  of  the  third 
moment  of  this  quantity  for  a  wide  range  of  space 
groups  of  high  symmetry.  Complete  five-term  expan¬ 
sions  for  the  probability  density  function  of  E  for 
centrosymmetric  space  groups  were  obtained,  without 
resorting  to  the  assumption  of  general  positions. 
Previously  derived  simplifications  of  these  statistics 
are  applied  here  to  the  standard  even-even  cumulant- 
moment  relationships  and  the  results  are  thereby 
brought  to  a  very  concise  functional  form,  which 
applies  to  non-centrosymmetric  statistics  as  well. 

Practical aspects of implementation of these statistics in a computer program, including moments of the trigonometric structure factor, angular dependence of the composition-dependent terms and an efficient arrangement of the computations, are discussed, and some outstanding problems of theoretical as well as practical interest are briefly indicated.


1.  Introduction 

Statistical  properties  of  sums  of  independent  random 
variables  can  often  be  represented  by  ideal  distributions 
which  follow  from  the  central-limit  theorem  (e.g., 
Cramer,  1951).  Such  distributions  form  the  basis  of 
the  Wilson  (1949)  statistics  and  have  been  extensively 
used  in  crystallography.  If,  however,  the  relative 
contributions  of  these  variables  to  the  value  of  the 
sum  are  widely  different,  e.g.,  if  one  of  them  is  out¬ 
standingly  large,  the  actual  distribution  may  deviate 
significantly  from  the  expected  ideal  one,  and  such 
non-ideal  distributions  may  often  be  approximated 
by  introducing  correction  terms  that  depend 
explicitly  on  the  cause  of  departure  from  the  ideal 
situation  (Cramer,  1951). 

The distribution of diffracted intensity often poses such problems to the crystallographer, and the subject of generalized distributions has been rather extensively investigated (Karle & Hauptman, 1953; Hauptman & Karle, 1953; Bertaut, 1955; Klug, 1958; Foster & Hargreaves, 1963a; Srinivasan & Parthasarathy, 1976). However, apart from a few applications of the generalized-moment method (Foster & Hargreaves, 1963a, 1963b; Goldberg & Shmueli, 1971), no practical use seems to have been made of these non-ideal statistics.

Recent investigations of generalized intensity statistics (Wilson, 1978; Shmueli, 1979; Shmueli & Kaldor, 1981; Shmueli & Wilson, 1981) concentrated mainly on the generalization of their symmetry dependence and on the simplification of the formalism, as these developments appeared to be of major importance in promoting the applicability of such non-ideal statistics. A further simplification, accompanied by a discussion of some practical aspects and by several encouraging comparisons with distributions which were recalculated from published metallo-organic structures, has been given (Shmueli, 1982a) and a summary of these developments has been reported (Shmueli, Kaldor & Wilson, 1981).

This extension of the latter reference aims at (i) standardizing the derivations of the non-ideal statistics, without resorting to the frequently made assumption that all the atoms have to be located in general positions, by applying the conventional cumulant-moment relationships to the centrosymmetric case, and (ii) discussing the implementation of these results in a routine computer program. It is also intended to exploit the simplifications achieved in the former derivations (Shmueli & Wilson, 1981; Shmueli, 1982a) in trying to represent the expressions as concisely as is practicable.

2.  Non-ideal  intensity  statistics 

This derivation of generalized moments and distributions for the centrosymmetric case is subject to the assumptions that (i) the atomic contributions to the structure factor may be regarded as independent random variables and (ii) the effects of anomalous dispersion on the intensity distribution are negligible. The contributions in (i) are those of groups of identical atoms related by space-group operations, as given by the atomic trigonometric structure factors, and independence implies that there is no non-crystallographic symmetry within the asymmetric unit. Atoms in fixed special positions obviously do not qualify as such contributors and their effect on the distribution must be treated separately.

The structure factor is given by

$$F = \sum_{j=1}^{m} f_j J_j, \tag{1}$$

where m is the number of atoms in the asymmetric unit, $f_j$ is the atomic scattering factor and $J_j$ is the trigonometric structure factor of the jth atom (Wilson, 1978).

Both F and $J_j$ can here be regarded as real random variables with zero means, and all the odd moments of F vanish. The distribution of F is thus determined by its even moments but can also be represented in terms of the cumulants of F, which are related to the moments in a known and often useful way (e.g., Cramer, 1951). Considering general standard relationships between cumulants and moments about the mean (central moments) [e.g., equations (3.43) in Kendall & Stuart, 1969], it is seen that if all the odd moments of a distribution vanish, only even cumulants remain and the first five such cumulants reduce to

$$K_2 = \mu_2, \tag{2}$$

$$K_4 = \mu_4 - 3\mu_2^2, \tag{3}$$

$$K_6 = \mu_6 - 15\mu_4\mu_2 + 30\mu_2^3, \tag{4}$$

$$K_8 = \mu_8 - 28\mu_6\mu_2 + 420\mu_4\mu_2^2 - 630\mu_2^4 - 35\mu_4^2 \tag{5}$$

and

$$K_{10} = \mu_{10} - 45\mu_8\mu_2 + 1260\mu_6\mu_2^2 - 18900\mu_4\mu_2^3 + 22680\mu_2^5 - 210\mu_6\mu_4 + 3150\mu_4^2\mu_2, \tag{6}$$

where $K_{2p}$ and $\mu_{2p}$ denote the cumulants and central moments respectively, for a distribution of a real random variable (Kendall & Stuart, 1969).
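The even cumulant-moment relationships quoted above are easy to check numerically. The following is a minimal sketch (the function name `even_cumulants` is hypothetical, not part of any program referred to in the text) which evaluates $K_2$ to $K_{10}$ from the even central moments and confirms that all cumulants beyond the second vanish for a standard Gaussian, whose even moments are $(2k-1)!!$.

```python
def even_cumulants(mu):
    """Even cumulants K_2..K_10 from even central moments mu[2]..mu[10].

    Valid only when all odd moments vanish, as assumed in the text.
    """
    m2, m4, m6, m8, m10 = (mu[n] for n in (2, 4, 6, 8, 10))
    return {
        2: m2,
        4: m4 - 3 * m2 ** 2,
        6: m6 - 15 * m4 * m2 + 30 * m2 ** 3,
        8: m8 - 28 * m6 * m2 + 420 * m4 * m2 ** 2 - 630 * m2 ** 4 - 35 * m4 ** 2,
        10: (m10 - 45 * m8 * m2 + 1260 * m6 * m2 ** 2 - 18900 * m4 * m2 ** 3
             + 22680 * m2 ** 5 - 210 * m6 * m4 + 3150 * m4 ** 2 * m2),
    }


# Even moments of the standard Gaussian: 1, 3, 15, 105, 945.
gaussian_moments = {2: 1, 4: 3, 6: 15, 8: 105, 10: 945}
print(even_cumulants(gaussian_moments))
```

The vanishing of the higher cumulants in the Gaussian case is what makes them natural measures of the departure of a distribution from the ideal one.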

The cumulants of F, to be denoted by $K_{2p}(F)$, are thus given by the l.h.s. of equations (2)-(6) where the moments $\mu_{2p}$ in the r.h.s. of these equations are replaced by the corresponding moments of F. Thus

$$K_2(F) = \langle F^2\rangle, \tag{7}$$

$$K_4(F) = \langle F^4\rangle - 3\langle F^2\rangle^2, \tag{8}$$

and so on for the remaining cumulants. These relations will be referred to as the F-version of equations (2)-(6).

According to the additivity of cumulants of independent random variables (e.g., Cramer, 1951, p. 192), the 2pth cumulant of F is given by a sum of 2pth cumulants of the atomic contributions $f_j J_j$. Thus

$$K_{2p}(F) = \sum_{j=1}^{m} K_{2p}(f_j J_j). \tag{9}$$

Another result of the statistical theory tells us that if a random variable, say x, undergoes a linear transformation

$$y = ax + b, \tag{10}$$

where a and b are constants, the rth cumulant of the transformed variable is given by

$$K_r(y) = a^r K_r(x) \tag{11}$$

and is invariant to the shift of the origin (e.g., Kendall & Stuart, 1969, p. 68). Applying this result to equation (9), we have

$$K_{2p}(F) = \sum_{j=1}^{m} f_j^{2p} K_{2p}(J_j), \tag{12}$$

where the transformation consists of multiplying $J_j$ by the constant $f_j$.

We can now relate the cumulants of the trigonometric structure factors $J_j$ to their moments, which depend on the space-group symmetry of the crystal (Wilson, 1978; Shmueli & Kaldor, 1981), by constructing a J-version of equations (2)-(6), and the relationships between the cumulants of F and the composition and symmetry of the crystal follow readily from (12). However, before doing so we must note that the even cumulants depend on odd moments as well, and the vanishing of the odd moments of J is less obvious than in the case of F. Specifically, terms depending on $\mu_3^2$, $\mu_5\mu_3$, $\mu_5^2$ and $\mu_7\mu_3$ appear in the complete versions of equations (4), (5) and (6) [cf. Kendall & Stuart, 1969, p. 71]. By analogy with the Wilson (1978) statistics of the trigonometric structure factor, its odd moments are expected to vanish (Shmueli & Wilson, 1981) and were also assumed to be zero by previous authors (e.g., Karle & Hauptman, 1953; Bertaut, 1955). However, it was stated by Foster & Hargreaves (1963a) and recalled by Srinivasan & Parthasarathy (1976) that the mixed odd-even partial moments of the trigonometric structure factor vanish for low symmetries but may be non-zero for symmetries higher than orthorhombic. No demonstration of the truth of this statement was given. This implies that odd moments of J may exist for higher symmetries and such a possibility deserves some consideration, in spite of the fact that odd moments of J are bound to have zero lower limits (cf. Shmueli, 1982b). In order to test this possibility, the third moment of J was computed by the method of Shmueli & Kaldor (1981) for all the trigonal, hexagonal and primitive tetragonal space groups, and was found to vanish for all of them. Although we intend to complete these computations of $\langle J^3\rangle$ and perform them also for $\langle J^5\rangle$, we feel justified in using equations (2)-(6) as they stand, in their J-version.

Denoting the first five successive even moments of J by p, q, r, s and t respectively (cf. Shmueli & Wilson, 1981) we have, using (12) and the J-version of (2)-(6),

$$K_2(F) = \sum_{j=1}^{m} f_j^2 p_j, \tag{13}$$

$$K_4(F) = \sum_{j=1}^{m} f_j^4 (q_j - 3p_j^2), \tag{14}$$

$$K_6(F) = \sum_{j=1}^{m} f_j^6 (r_j - 15 q_j p_j + 30 p_j^3), \tag{15}$$

$$K_8(F) = \sum_{j=1}^{m} f_j^8 (s_j - 28 r_j p_j + 420 q_j p_j^2 - 630 p_j^4 - 35 q_j^2) \tag{16}$$

and

$$K_{10}(F) = \sum_{j=1}^{m} f_j^{10} (t_j - 45 s_j p_j + 1260 r_j p_j^2 - 18900 q_j p_j^3 + 22680 p_j^5 - 210 r_j q_j + 3150 q_j^2 p_j). \tag{17}$$

Comparing (13) and (14) with (7) and (8) respectively we obtain

$$\langle F^2\rangle = \sum_{j=1}^{m} f_j^2 p_j, \tag{18}$$

$$\langle F^4\rangle = \sum_{j=1}^{m} f_j^4 (q_j - 3p_j^2) + 3\langle F^2\rangle^2, \tag{19}$$

in agreement with Wilson (1978), and similar comparisons readily lead to general equations for $\langle F^6\rangle$, $\langle F^8\rangle$ and $\langle F^{10}\rangle$ that reduce to those given by Shmueli & Wilson (1981) and Shmueli (1982a) for the case of all atoms occupying general positions of a centrosymmetric space group.

The above results can be expressed in terms of the normalized structure factor E by noting that $\langle E^{2k}\rangle = \langle F^{2k}\rangle / \langle F^2\rangle^k$ and hence

$$K_{2p}(E) = K_{2p}(F) / \langle F^2\rangle^p. \tag{20}$$

The  moments  of  E  can  now  be  readily  related  to  the 
composition  and  symmetry  of  the  crystal  but  the 
detailed  structure  of  these  moments  will  not  be 
required  for  the  specification  of  the  available  terms  in 
the  probability  density  function  (p.d.f.)  of  E.  This 
function is given by

$$P_c(E) = (2\pi)^{-1/2} \exp\left(-\frac{E^2}{2}\right) \left[1 + \sum_{k=2}^{\infty} \frac{A_{2k}}{(2k)!}\, He_{2k}(E)\right] \tag{21}$$

(Shmueli & Wilson, 1981; Shmueli, 1982a), where $He_{2k}$ are Hermite polynomials as defined, e.g., by Abramowitz and Stegun (1972), and the Gaussian multiplying the 'correction' expansion is based on the Wilson (1949) centric distribution. The moments of E can be related to the expansion coefficients $A_{2k}$ by the integral

$$\langle E^{2k}\rangle = \int_{-\infty}^{\infty} E^{2k} P_c(E)\, dE, \tag{22}$$

which leads to

$$\langle E^{2k}\rangle = (2k-1)!! + \sum_{p=2}^{k} \binom{k}{p} \frac{(2k-1)!!}{(2p-1)!!}\, A_{2p}, \tag{23}$$

where $(2k-1)!! = (2k)!/[2^k k!]$ (Shmueli, 1982a). Upon substituting the moments from (23) into the E-version of equations (2)-(6) and expressing the coefficients $A_{2p}$ in terms of the cumulants of E, we obtain


$$\Sigma^2 A_4 = K_4(F), \tag{24}$$

$$\Sigma^3 A_6 = K_6(F), \tag{25}$$

$$\Sigma^4 A_8 = K_8(F) + 35[K_4(F)]^2 \tag{26}$$

and

$$\Sigma^5 A_{10} = K_{10}(F) + 210 K_6(F) K_4(F), \tag{27}$$

where $\Sigma = \langle F^2\rangle$. For the case of all atoms occupying general positions the above results reduce to those obtained by Shmueli & Wilson (1981) and Shmueli (1982a), who used a more lengthy procedure involving a detailed direct treatment of the moments of |F| and |E|. The p.d.f. of the magnitude of E is obtained by doubling the normalization constant $(2\pi)^{-1/2}$ and replacing E with |E| throughout equation (21).
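The link between the expansion coefficients and the moments can be verified by direct numerical integration. The sketch below assumes that the correction series has the Gram-Charlier form $1 + \sum_k A_{2k} He_{2k}(E)/(2k)!$, with $He_n$ generated by the recurrence $He_{n+1}(x) = x\,He_n(x) - n\,He_{n-1}(x)$; all function names and the cumulant values are hypothetical and chosen for illustration only.

```python
from math import exp, factorial, pi, sqrt


def hermite_e(n, x):
    """Hermite polynomial He_n(x) from He_{k+1} = x*He_k - k*He_{k-1}."""
    previous, current = 1.0, x
    if n == 0:
        return previous
    for k in range(1, n):
        previous, current = current, x * current - k * previous
    return current


def expansion_coefficients(k_e):
    """A_4..A_10 from the cumulants K_4(E)..K_10(E), cf. equations (24)-(27)."""
    return {
        4: k_e[4],
        6: k_e[6],
        8: k_e[8] + 35 * k_e[4] ** 2,
        10: k_e[10] + 210 * k_e[6] * k_e[4],
    }


def pdf_centric(e, coefficients):
    """Centric Gaussian multiplied by the (assumed) Gram-Charlier correction series."""
    series = 1.0 + sum(a * hermite_e(n, e) / factorial(n)
                       for n, a in coefficients.items())
    return exp(-0.5 * e * e) / sqrt(2.0 * pi) * series


def moment(order, coefficients, half_width=12.0, steps=24000):
    """Trapezoidal estimate of <E^order> over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        e = -half_width + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * e ** order * pdf_centric(e, coefficients)
    return total * h


# Hypothetical cumulants of E, for illustration only.
A = expansion_coefficients({4: -0.0625, 6: 0.01, 8: -0.01, 10: 0.01})
print(moment(0, A), moment(4, A), 3 + A[4])
```

Under these assumptions the integrated fourth and sixth moments reproduce $3 + A_4$ and $15 + 15A_4 + A_6$, as expected from the moment relationship given above.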

The above approach, employing a straightforward application of well-known properties of cumulants and their relationships to central moments, can be extended to the non-centrosymmetric case as well. However, a bivariate distribution must then be considered and this leads quite naturally to mixed moments and cumulants and hence to a rather complicated algebra, both in the derivation and the simplification of the resulting expressions. On the other hand, the expressions obtained by the use of moments alone (Shmueli & Wilson, 1981; Shmueli, 1982a) have, apart from the p.d.f.'s, identical functional forms for the centrosymmetric and non-centrosymmetric space groups that follow directly from their unified derivations, and are of a comparable complexity to those given above for the centrosymmetric case. Further simplifications were achieved in the direct moment approach (Shmueli, 1982a) and it is interesting to note that they are applicable to the general even-even cumulant-moment relationships given by equations (2)-(6). These can be written as

$$K_{2r} = a_{2r} + a'_{2r}, \tag{28}$$

where

$$a_{2r} = \sum_{p=2}^{r} (-1)^{r-p} (r-p)!\, a_{rp}\, \mu_{2p}\, \mu_2^{r-p} + (-1)^{r-1} (r-1)!\, a_{r0}\, \mu_2^{r}, \tag{29}$$

$$a'_2 = a'_4 = a'_6 = 0, \quad a'_8 = -35\mu_4^2, \quad a'_{10} = -210\mu_6\mu_4 + 3150\mu_4^2\mu_2 \tag{30}$$

and

$$a_{rp} = \binom{r}{p} \frac{(2r-1)!!}{(2p-1)!!}. \tag{31}$$

[See also Shmueli, pp. 53-82 above.] Only three numerical constants are left in the first five even-even $K$-$\mu$ relationships, thus enabling a more concise presentation of the cumulants and hence of the whole formalism. E.g., the first five even cumulants of F can now be written as


$$K_{2p}(F) = \sum_{j=1}^{m} f_j^{2p} (a_{2p,j} + a'_{2p,j}), \tag{32}$$

where $a_{2p,j}$ and $a'_{2p,j}$ are given by (29) and (30) respectively, with the moments $\mu_{2k}$ replaced by $\langle J_j^{2k}\rangle$ throughout these equations.
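The concise form of the cumulant-moment relationships lends itself to a short routine. The following is a minimal sketch, assuming the centric coefficients $a_{rp} = \binom{r}{p}(2r-1)!!/(2p-1)!!$ with $(-1)!! = 1$; the function names are hypothetical. As a check it uses the Gaussian moments and, as an assumed example, the moments $\binom{2k}{k}$ of $J = 2\cos\vartheta$ with $\vartheta$ uniformly distributed (a general position in $P\bar{1}$).

```python
from math import comb, factorial


def double_factorial(n):
    """n!! for odd n, with the convention (-1)!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result


def a_rp(r, p):
    """Centric coefficient: binomial(r, p) * (2r-1)!! / (2p-1)!! (always an integer)."""
    return comb(r, p) * double_factorial(2 * r - 1) // double_factorial(2 * p - 1)


def even_cumulant(r, mu):
    """K_2r (r = 1..5) from even central moments mu[2], mu[4], ..., mu[10]."""
    m2 = mu[2]
    value = sum((-1) ** (r - p) * factorial(r - p) * a_rp(r, p)
                * mu[2 * p] * m2 ** (r - p) for p in range(2, r + 1))
    value += (-1) ** (r - 1) * factorial(r - 1) * a_rp(r, 0) * m2 ** r
    # Only three extra numerical constants are needed: 35, 210 and 3150.
    if r == 4:
        value -= 35 * mu[4] ** 2
    elif r == 5:
        value += -210 * mu[6] * mu[4] + 3150 * mu[4] ** 2 * m2
    return value


gaussian = {2 * k: double_factorial(2 * k - 1) for k in range(1, 6)}
two_cos = {2 * k: comb(2 * k, k) for k in range(1, 6)}  # <(2 cos)^2k>, assumed example

print([even_cumulant(r, gaussian) for r in range(1, 6)])
print([even_cumulant(r, two_cos) for r in range(1, 6)])
```

Multiplying each per-atom value by $f_j^{2p}$ and summing over the atoms then gives the cumulants of F.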

We conclude this section with a note on the corresponding expressions for non-centrosymmetric space groups. These non-ideal statistics are based on a p.d.f. of the magnitude of E or F [cf. equation (36), Shmueli (1982b)], and differ from the centrosymmetric statistics in the definition of $a_{rp}$, which is given by

$$a_{rp} = \binom{r}{p} \frac{r!}{p!} \tag{33}$$

and in the values of the numerical constants in (30). The acentric version of the latter is obtained by replacing 35, 210 and 3150 with 18, 100 and 900 respectively (Shmueli 1982a, 1982b). There are also exact acentric analogues of equations (13)-(17) or their unified form (32), as far as their relationships to the moments of |F| and the expansion coefficients of the acentric p.d.f. are concerned. However, the left-hand sides of the acentric versions of (13)-(17) cannot be simply interpreted as cumulants of |F|.

These  simple  results  for  the  non-ideal  statistics  of 
the  magnitude  of  a  complex  variable,  the  real  and 
imaginary  parts  of  which  are  sums  of  real  random 
variables,  contribute  to  the  applicability  of  generalized 
intensity  statistics  and  also  appear  to  be  of  a  more 
general  interest. 


3.  Practical  considerations 

Applications of the above formalism to the resolution of space-group ambiguities in the case of extreme atomic heterogeneity have been described by Shmueli (1982a). It was shown that in cases in which all the atoms, including the outstandingly heavy scatterer, are located in general positions and there is no conspicuous hypersymmetry in the structure, the appropriately simplified versions of the non-ideal statistics discussed in the previous section lead to a reliable indication of the known space-group symmetry. One of the examples is shown in Fig. 3 of the paper by Shmueli (p. 81 above). The latter paper illustrates the effects of atomic heterogeneity and space-group symmetry on the p.d.f. of the structure amplitude, reviews the mathematical and crystallographic considerations which lead to the construction of the generalized statistics that cope with such effects and discusses in some detail the convergence behaviour of the expansions [e.g., equation (21)] for the probability density functions of |E|. Of the two known arrangements of terms in such expansions, the Edgeworth and the Gram-Charlier forms (Cramer, 1951), the latter was shown to be preferable and all the expansions presented there are given in their Gram-Charlier arrangements (Shmueli, 1982b). We now wish to summarize some practical considerations which may be of interest to the reader who wishes to implement these generalized statistics in routine structure determination procedures.

To be specific, let us rewrite explicitly one of the cumulants of E, say $K_8(E)$, using equations (16) and (20) above. We have

$$K_8(E) = \sum_{j=1}^{m} f_j^8 (s_j - 28 r_j p_j + 420 q_j p_j^2 - 630 p_j^4 - 35 q_j^2) \Big/ \Big(\sum_{j=1}^{m} f_j^2 p_j\Big)^4, \tag{34}$$

where p, q, r and s are the second, fourth, sixth and eighth moments of the trigonometric structure factors respectively. Assuming that all the atoms are located in general positions or, in statistical terms, that all the atomic trigonometric structure factors have the same distribution, we can drop the subscripts from p, q, r and s and, after using (29), equation (34) can be simplified to

$$K_8(E) = (\gamma_8 - a_{43}\gamma_6 + 2a_{42}\gamma_4 - 6a_{40} - 35\gamma_4^2)\, Q_8, \tag{35}$$

where

$$\gamma_{2k} = \langle J^{2k}\rangle / \langle J^2\rangle^k, \qquad Q_{2k} = \sum_{j=1}^{m} f_j^{2k} \Big/ \Big(\sum_{j=1}^{m} f_j^2\Big)^k, \tag{36}$$

and given the $\gamma_{2k}$'s and $Q_{2k}$'s, the cumulants for the centric distribution [cf. equations (24)-(27)] and their acentric analogues can be most readily computed, and the theoretical distributions evaluated.
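The separation into a symmetry-dependent factor and a composition-dependent factor can be sketched as follows. This is only an illustration under stated assumptions: the standardized moments are those of a general position in $P\bar{1}$, taken here as $\gamma_{2k} = \binom{2k}{k}/2^k$ (from $J = 2\cos\vartheta$ with uniformly distributed $\vartheta$), the scattering factors are hypothetical constants, and the function names are not those of any existing program.

```python
from math import comb


def q_term(f, k):
    """Composition term Q_2k = sum_j f_j^(2k) / (sum_j f_j^2)^k."""
    return sum(fj ** (2 * k) for fj in f) / sum(fj ** 2 for fj in f) ** k


def k4_k8_of_e(gamma, f):
    """K_4(E) and K_8(E) for atoms in general positions (centric case)."""
    a43, a42, a40 = 28, 210, 105  # centric a_rp values for r = 4
    k4 = (gamma[4] - 3) * q_term(f, 2)
    k8 = (gamma[8] - a43 * gamma[6] + 2 * a42 * gamma[4] - 6 * a40
          - 35 * gamma[4] ** 2) * q_term(f, 4)
    return k4, k8


# Assumed standardized moments for a general position in P-1.
gamma_p1bar = {2 * k: comb(2 * k, k) / 2 ** k for k in range(2, 6)}

# Hypothetical equal-atom asymmetric unit of 24 atoms.
k4, k8 = k4_k8_of_e(gamma_p1bar, [6.0] * 24)
a8 = k8 + 35 * k4 ** 2  # expansion coefficient A_8 in terms of cumulants of E
print(k4, k8, a8)
```

Since the $\gamma_{2k}$ depend only on the symmetry and the $Q_{2k}$ only on the composition, the former can be tabulated once and the latter recomputed for each structure.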

As for the standardized moment ratios $\gamma_{2k}$, which link these statistics to the space-group symmetry, closed expressions are so far available for the triclinic, monoclinic and orthorhombic space groups (except Fdd2 and Fddd) (Shmueli, 1982a) and we present for convenience the numerical values of $\gamma_4$, $\gamma_6$, $\gamma_8$ and $\gamma_{10}$, for the above symmetries, in Table 1. These moment ratios are sufficient to compute (the available) five-term expansions for the low-symmetry space groups. Values of $\gamma_4$ and $\gamma_6$ were obtained for all the space groups and all the hkl subsets leading to different intensity distributions by Shmueli & Kaldor (1981), and those of $\gamma_8$ will be presented elsewhere (Shmueli & Kaldor, 1982). Hence, statistical tests including the cumulative distribution of |E| as a three-term expansion, as well as the fourth and sixth moments of |E|, can now be carried out for any symmetry and the availability of moments for five-term expansions should not be long delayed.

Table 1. Moments of the trigonometric structure factor for triclinic, monoclinic and orthorhombic space groups (except Fdd2 and Fddd).

The values of $\gamma_{2k}$, as defined in (36), are based on closed expressions for these standardized moments, given by Shmueli (1982a). The factors of lattice multiplicity were excluded and must not be used in conventional comparisons. The values of the moments of |J| can be recalculated by noting that $\langle |J|^2\rangle$ equals the order of the point group. The values of $\gamma_{2k}$ are valid for structures having all the atoms in general positions and data sets consisting of general reflexions.

Point group(s)

A proper computation of the composition-dependent terms should allow for their, albeit not too strong, dependence on sin θ/λ. It was found convenient to place this computation after the data for the Wilson plot have been evaluated, since $Q_{2k}$ can then be computed as a weighted average over the shells used in the construction of the plot, the weight being taken as the number of reflexions contained in such a shell.
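The shell-weighted averaging described above can be sketched as follows. This is a hedged illustration, not the procedure of any particular program: the function names, the shell data and the scattering-factor values (which in practice would be evaluated at the mean sin θ/λ of each Wilson-plot shell) are all hypothetical.

```python
def q_term(f, k):
    """Composition term Q_2k = sum_j f_j^(2k) / (sum_j f_j^2)^k."""
    return sum(fj ** (2 * k) for fj in f) / sum(fj ** 2 for fj in f) ** k


def shell_averaged_q(shells, k):
    """Average Q_2k over shells, weighted by the number of reflexions per shell.

    shells: list of (number_of_reflexions, scattering factors at that shell).
    """
    n_total = sum(n for n, _ in shells)
    return sum(n * q_term(f, k) for n, f in shells) / n_total


# Hypothetical shells: one heavy atom and 14 light atoms, with scattering
# factors falling off as sin(theta)/lambda increases.
shells = [
    (120, [70.0] + [5.5] * 14),
    (200, [62.0] + [4.0] * 14),
    (150, [55.0] + [3.0] * 14),
]

for k in range(2, 6):
    print(2 * k, shell_averaged_q(shells, k))
```

The averaged terms can then be passed on to the evaluation of the cumulants in place of values computed at a single scattering angle.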

It therefore appears advisable to incorporate the generalized statistics in a routine which computes normalized structure amplitudes |E| and also evaluates the experimental statistics of this quantity, or to use the output of such a routine (scattering-factor constants, sin θ/λ ranges and weights and experimental statistics) as an input to a program which deals with generalized intensity statistics and compares the experimental with the possible theoretical distributions.

The procedure adopted by one of us (U.S.) was to modify the locally available program NORMAL (MULTAN 78, Main et al., 1978) so that the required information is output to a file which is re-edited by adding the required values of $\gamma_{2k}$ and a few control parameters. This file is then input to the local intensity statistics routine, INSTAT. Of course, the input to such a routine can be reduced to a specification of the space groups for which theoretical statistics have to be computed and the number of the required expansion terms.

There are several extensions of the available theoretical statistics which may be of interest from theoretical as well as experimental standpoints. We are now studying generalized statistics which account for the presence of complex scattering factors (Wilson & Shmueli, 1982). These may take care of significant effects of anomalous dispersion (Wilson, 1980) in highly heterogeneous asymmetric units, and are related to distributions arising from partial centrosymmetry (e.g., Srinivasan & Parthasarathy, 1976). The effect of fixed special positions, already investigated by Karle & Hauptman (1953) and Hauptman & Karle (1953), is of considerable interest and in need of simplification and generalization, and last, but not least, the effect of variable special positions can be accommodated by the formalism derived in the previous section, with some minor modifications, when the moments of the corresponding trigonometric structure factors become available (cf. Wilson, 1978).
Abstract

The cumulative intensity distribution function N(z), as well as the probabilities for the sign and phase relationships remaining valid [$P_+$ and $P(\varphi)$ respectively, as used in direct analytical determination of crystal structures], depend principally on the nature of distribution of structure-factor components.
So  far,  the  Gaussian  distribution  of  structure- 
amplitude  components  has  been  widely  used  for 
both these purposes. Recently, however, for N(z) tests, near-Gaussian expressions like Gram-Charlier and Edgeworth series have received extensive attention. Expressions for $P_+$ and $P(\varphi)$ have also been derived on the basis of similar distribution functions.
A  correlation  between  these  two  types  of  investigation 
has  been  attempted  in  the  present  work.  It  is  stressed 
that  one  cannot  empirically  or  intuitively  assume  a 
particular  distribution  to  hold  good  for  a  given 
crystal.  The  distribution  function  has  to  be  and  can 
be  determined  on  the  basis  of  the  comparison  of 
experimental  N(z)  values  with  those  calculated  on 
the  basis  of  different  distribution  functions.  Having 
obtained  the  best-fitted  distribution  function,  the 
expression for $P_+$ or $P(\varphi)$ corresponding to this
distribution  function  should  be  used  in  programs  for 
direct  determination  of  signs  or  phases  for  crystal- 
structure  determination.  In  course  of  the  present 
investigation  expressions  for  N0(z)  and  Nfz)  as  well 
as for $P_+$ and $P(\varphi)$ respectively based on Edgeworth, Rayleigh and some other near-Gaussian distributions have been worked out. The case of crystals containing atoms some of which are outstandingly heavy has also received attention, and it has been shown that even when the Gaussian type of distribution holds good the distribution is Gaussian with shifted peaks, the peak-shift depending on the difference between the weights of the heavy and light atoms.

1.  Introduction 


Around the early fifties of the present century, important developments took place in theoretical X-ray crystallography. Wilson (1949) enunciated the principles of intensity statistics, while Sayre (1952) gave the relation connecting the phase of a given reflection with the magnitudes and phases of all other reflections. From this, the phase of one reflection can be expressed in terms of the phases of two other reflections with some probability. An expression for this probability was calculated by Cochran & Woolfson (1955) and by Cochran (1955) for centrosymmetric and non-centrosymmetric crystals respectively, on the basis of Gaussian statistics.

Recently, for centrosymmetric crystals, Giacovazzo (1976) introduced the Gram-Charlier series and obtained an expression for $P_+(s_h s_{h'} s_{h-h'} \approx +1)$, the probability that $s_h s_{h'} s_{h-h'} \approx 1$. This probability is slightly different from that given by Cochran & Woolfson (1955). The two expressions are as follows:

(a) Cochran & Woolfson (1955) (Gaussian distribution)

$$P_+(s_h s_{h'} s_{h-h'}) = \tfrac{1}{2} + \tfrac{1}{2}\tanh\left(N^{-1/2}\,|E_h E_{h'} E_{h-h'}|\right) \tag{1}$$

(b) Giacovazzo (1976) (Gram-Charlier series)

$$P_+(s_h s_{h'} s_{h-h'}) \simeq \tfrac{1}{2} + \tfrac{1}{2}\tanh N^{-1/2}\,|E_h E_{h'} E_{h-h'}|\,\ldots \tag{2}$$
where $A$ and $B$ are complicated functions of the $E$'s. Here $E(\mathbf{h}) = F(\mathbf{h})/(\sum_j f_j^2)^{1/2}$, where $F(\mathbf{h})$ and $f_j$ are the structure factor and the atomic scattering factor for the reflection $\mathbf{h} = (h, k, l)$ respectively. Similarly, for non-centrosymmetric crystals, expressions for the validity of phase relations involving triplets such as $\varphi_1 + \varphi_2 + \varphi_3 \approx 0$ [where $\varphi_h$ is given by $F(\mathbf{h}) = |F(\mathbf{h})| \exp(i\varphi_h)$], and also for quadruplets, quintets, sextuplets, etc., have been worked out by Giacovazzo (1976) and Hauptman (1977) on the basis of Gram-Charlier series and Rayleigh distributions respectively. These differ from each other and also from the one derived by Cochran (1955) on the basis of the Gaussian distribution. Again, the cumulative probability distribution $N(z) = \int_0^z p(z)\,dz$, where $p(z)\,dz$ is the probability of $z$ lying between $z$ and $z + dz$ and $z = |F|^2$, has been found to depend, for centrosymmetric crystals, on whether the distribution is Gaussian (Wilson, 1949), an Edgeworth series (Mitra & Belgaumkar, 1973) or a Gram-Charlier series (Shmueli, 1979; Shmueli & Wilson, 1981). Similar results are expected to be valid also for non-centrosymmetric crystals.

2.  The  Gaussian  distribution  law — its  applications, 
implications  and  extensions 

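The intensity $I(\mathbf{h})$ of a given reflection $\mathbf{h} = (h, k, l)$ is given by

$$I(\mathbf{h}) = X^2(\mathbf{h}) + Y^2(\mathbf{h}), \tag{3}$$

where

$$X(\mathbf{h}) = \sum_j f_j \cos(2\pi\,\mathbf{h}\cdot\mathbf{r}_j)$$

and

$$Y(\mathbf{h}) = \sum_j f_j \sin(2\pi\,\mathbf{h}\cdot\mathbf{r}_j),$$

$\mathbf{r}_j$ being the position vector of the $j$th atom in the unit cell. There are $N$ atoms in the unit cell and $j$ ranges from 1 to $N$. Let us call $X$ and $Y$ the structure-factor components, while $F$ is the structure factor. For centrosymmetric crystals $F(\mathbf{h}) = X(\mathbf{h})$,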
while for non-centrosymmetric crystals equation (3) holds good. By invoking the central limit theorem, Wilson (1949) concluded that $X(\mathbf{h})$ and $Y(\mathbf{h})$ obey Gaussian statistics. Starting from this, Wilson (1949) devised methods of distinguishing between centrosymmetric and non-centrosymmetric unit cells. An improved technique for achieving this end was devised by Howells, Phillips & Rogers (1950). Assuming the Wilson (1949) expression for $p(z)$, they derived expressions for the cumulative probability function $N(z)$ and showed that for a unit cell with no centre of symmetry

$$N_0(z) = 1 - \exp(-z) \tag{4}$$

while for a unit cell with a centre of symmetry

$$N_1(z) = \operatorname{erf}\sqrt{z/2}. \tag{5}$$
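As a small numerical illustration, the two Wilson-type cumulative functions of equations (4) and (5) can be tabulated directly. The function names below are illustrative only and are not taken from the original paper.

```python
import math


def n_acentric(z: float) -> float:
    """Equation (4): N0(z) = 1 - exp(-z), no centre of symmetry."""
    return 1.0 - math.exp(-z)


def n_centric(z: float) -> float:
    """Equation (5): N1(z) = erf(sqrt(z/2)), centre of symmetry present."""
    return math.erf(math.sqrt(z / 2.0))


if __name__ == "__main__":
    # Tabulate both curves on the grid commonly used for N(z) tests.
    for z in (0.1, 0.2, 0.4, 0.6, 0.8, 1.0):
        print(f"z={z:.1f}  N0={n_acentric(z):.3f}  N1={n_centric(z):.3f}")
```

At small z the centric curve lies well above the acentric one, which is what makes the comparison with experimental N(z) values a usable test.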

Fitting of experimental $N(z)$ values to the theoretical curves for $N_0(z)$ and $N_1(z)$ is expected to establish the presence or absence of a centre of symmetry. However, Hargreaves (1955) observed that crystals with heterogeneous atoms, some being outstandingly heavy and the remainder relatively light, show experimental $N(z)$ values not predicted by equation (4) or (5). Hargreaves (1955) correctly attributed this to the fact that Wilson (1949) intensity statistics are valid for a conglomeration of atoms of the same or nearly the same atomic number. The theoretical justification of this conclusion lies in the proper enunciation of the central-limit theorem, which according to Feller (1969) may be stated as:

Provided that $S_1, S_2, \ldots, S_n$ are all independent random variables having the same distribution $F$, that the average of $S_k$ is $\langle S_k \rangle = 0$, where $k$ is any one of the values 1 to $n$, and that the variance of $S_k$ is $\operatorname{var} S_k = 1$, the distribution of the function $S = (S_1 + S_2 + S_3 + \ldots + S_n)\,n^{-1/2}$ tends to be Gaussian in the limit $n \to \infty$.

The enunciation of the central-limit theorem just quoted is simple, and if the stated conditions are fulfilled convergence to the limit as $n$ increases is quite rapid. However, the theorem holds even if the variables $S_k$ have different distributions, though convergence to the Gaussian form may be much slower. In such cases a Gram-Charlier or Edgeworth series with several terms may be satisfactory; this point has been discussed in some detail by Shmueli and Wilson (Shmueli, 1979; Shmueli & Wilson, 1981; Shmueli, 1982). The requirement of independence may also be relaxed (French & Wilson, 1978; Wilson, 1981).
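The rapid convergence for equal atoms can be checked with a short Monte Carlo sketch. The model is a deliberately simplified assumption, not something taken from the paper: equal point atoms of unit scattering factor in inversion-related pairs, with the phase angle of each independent atom drawn uniformly. All function names are hypothetical.

```python
import math
import random


def simulate_centric_z(n_atoms: int, n_reflections: int, seed: int = 0) -> list:
    """Return normalised intensities z = F^2 / <F^2> for a toy centric model.

    Assumed model: n_atoms equal atoms (unit scattering factor) arranged in
    inversion-related pairs; the angle 2*pi*h.r_j of each independent atom is
    treated as uniformly distributed.
    """
    rng = random.Random(seed)
    half = n_atoms // 2
    mean_intensity = 2.0 * half  # <F^2> = sum of f_j^2 over all atoms
    zs = []
    for _ in range(n_reflections):
        # Each inversion-related pair contributes 2*cos(2*pi*h.r_j) to F.
        f = 2.0 * sum(math.cos(2.0 * math.pi * rng.random()) for _ in range(half))
        zs.append(f * f / mean_intensity)
    return zs


def empirical_n(zs, z: float) -> float:
    """Fraction of reflections whose normalised intensity does not exceed z."""
    return sum(1 for value in zs if value <= z) / len(zs)


if __name__ == "__main__":
    zs = simulate_centric_z(n_atoms=40, n_reflections=20000, seed=1)
    for z in (0.2, 0.6, 1.0):
        print(z, round(empirical_n(zs, z), 3), round(math.erf(math.sqrt(z / 2.0)), 3))
```

Even with only 20 independent atoms, the simulated fractions lie close to the Gaussian (centric) curve; replacing one atom with a much heavier one in such a model is the situation where the conditions of the theorem are no longer met.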

Collin (1955) and Sim (1958) considered the case of a crystal containing one heavy atom in the asymmetric unit of the unit cell. The heavy atom was placed at the origin and the unit cell was chosen accordingly. The components of the structure factor $F(\mathbf{h})$ given by

$$F(\mathbf{h}) = f_H + \sum_L f_L \exp(2\pi i\,\mathbf{h}\cdot\mathbf{r}_L)$$

are now no longer random variables obeying Gaussian or near-Gaussian statistics, but those of $F(\mathbf{h}) - f_H$ are supposed to be so. The resultant expressions are

$$N_0(z) = (1 + r^2)\exp(-r^2)\int_0^z \exp[-(1 + r^2)z]\; I_0[2r(1 + r^2)^{1/2} z^{1/2}]\,dz \tag{6}$$

and

$$N_1(z) = \Phi[(1 + r^2)^{1/2} z^{1/2} - r] + \Phi[(1 + r^2)^{1/2} z^{1/2} + r], \tag{7}$$

where

$$r = f_H/(\textstyle\sum_L f_L^2)^{1/2}, \qquad \Phi(x) = (2\pi)^{-1/2}\int_0^x \exp(-t^2/2)\,dt,$$

and $I_0(x) = \frac{1}{\pi}\int_0^{\pi} \exp(-x\cos\varphi)\,d\varphi$ is a hyperbolic Bessel function. All these developments have been made on the assumption of a Gaussian distribution for the structure-factor components. Klug (1958) and Bertaut (1955) suggested the use of near-Gaussian distributions like the Gram-Charlier series. Mitra & Belgaumkar


104  G.  B.  Mitra  and  Sikha  Ghosh 


(1973),  Shmueli  (1979)  and  Shmueli  &  Wilson  (1981) 
used  Edgeworth  and  Gram-Charlier  series  for  this 
purpose. 
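The heavy-atom expressions (6) and (7) are straightforward to evaluate numerically. The sketch below assumes the reading $r = f_H/(\sum_L f_L^2)^{1/2}$ for the heavy-atom parameter and uses a power-series Bessel function and Simpson's rule; these numerical choices and the function names are illustrative, not the authors' procedure.

```python
import math


def bessel_i0(x: float) -> float:
    """Modified Bessel function I0(x) from its series sum_k (x^2/4)^k / (k!)^2."""
    q = 0.25 * x * x
    term, total, k = 1.0, 1.0, 0
    while term > 1e-16 * total:
        k += 1
        term *= q / (k * k)
        total += term
    return total


def sim_n1(z: float, r: float) -> float:
    """Equation (7), using Phi(x) = (1/2) erf(x / sqrt(2))."""
    a = math.sqrt((1.0 + r * r) * z)
    return 0.5 * (math.erf((a - r) / math.sqrt(2.0)) + math.erf((a + r) / math.sqrt(2.0)))


def sim_n0(z: float, r: float, steps: int = 2000) -> float:
    """Equation (6), integrated with composite Simpson's rule (steps must be even)."""
    if z == 0.0:
        return 0.0
    k = 2.0 * r * math.sqrt(1.0 + r * r)

    def integrand(t: float) -> float:
        return (1.0 + r * r) * math.exp(-r * r - (1.0 + r * r) * t) * bessel_i0(k * math.sqrt(t))

    h = z / steps
    total = integrand(0.0) + integrand(z)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


if __name__ == "__main__":
    for r in (0.0, 1.0, 2.0):
        print(r, [round(sim_n1(z, r), 3) for z in (0.2, 0.6, 1.0)])
```

Setting r = 0 recovers the Wilson curves of equations (4) and (5), which gives a quick sanity check on any implementation.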

Along with these developments in intensity statistics, a parallel development was made in evaluating the probability of phase relations being valid. The Sayre (1952) sign relation $s(\mathbf{h})\,s(\mathbf{h}')\,s(\mathbf{h}-\mathbf{h}') \approx 1$ for centrosymmetric crystals, the Cochran (1955) phase relation $\varphi_{\bar{h}} + \varphi_{h'} + \varphi_{h-h'} \approx 0$ and the Hauptman & Karle (1953) tangent relation $\tan\varphi_h \approx \tan(\varphi_{h'} + \varphi_{h-h'})$ can all be derived from the fundamental Sayre (1952) equation

$$G(\mathbf{h}) = \theta_h F(\mathbf{h}) = \sum_{h'} F(\mathbf{h}')\,F(\mathbf{h}-\mathbf{h}') \tag{8}$$


where 


and $\theta_h$ is a ratio connecting the scattering factors of a crystal consisting of fictitious atoms with electron distribution $\rho^2(\mathbf{r})$, while the actual crystal consists of real atoms with electron density $\rho(\mathbf{r})$. Equation (8) evidently contains the assumption that the scattering factors of all atoms in the crystal are the same. Thus the phase-determining relations are strictly valid only for light-atom structures; for crystals containing heavy atoms they will have to be modified. Assuming the Gaussian distribution for the structure-factor components, Cochran & Woolfson (1955) derived the probability of validity of the sign relationship for centrosymmetric crystals as the expression in equation (1), while Cochran (1955) derived the probability of validity of the phase relation for non-centrosymmetric crystals as

$$P(\varphi) = \frac{\exp[-2Q\sin^2(\varphi/2)]}{2\pi\exp(-Q)\,I_0(Q)}, \tag{9}$$

where

$$Q = 2N^{-1/2}\,|E_h E_{h'} E_{h-h'}|.$$
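A short sketch shows how the Gaussian-statistics probabilities might be evaluated for a given triplet. It assumes the equal-atom forms $P_+ = \tfrac12 + \tfrac12\tanh(N^{-1/2}|E_hE_{h'}E_{h-h'}|)$ for equation (1) and $Q = 2N^{-1/2}|E_hE_{h'}E_{h-h'}|$ for equation (9); the function names and the sample E values are hypothetical.

```python
import math


def bessel_i0(x: float) -> float:
    """Modified Bessel function I0(x) from its power series."""
    q = 0.25 * x * x
    term, total, k = 1.0, 1.0, 0
    while term > 1e-16 * total:
        k += 1
        term *= q / (k * k)
        total += term
    return total


def sign_probability(n_atoms: int, e_h: float, e_h1: float, e_h2: float) -> float:
    """Equation (1): probability that the sign triple product is positive."""
    return 0.5 + 0.5 * math.tanh(abs(e_h * e_h1 * e_h2) / math.sqrt(n_atoms))


def cochran_density(phi: float, n_atoms: int, e_h: float, e_h1: float, e_h2: float) -> float:
    """Equation (9): density of the triplet phase sum phi (radians)."""
    q = 2.0 * abs(e_h * e_h1 * e_h2) / math.sqrt(n_atoms)
    return math.exp(-2.0 * q * math.sin(phi / 2.0) ** 2) / (
        2.0 * math.pi * math.exp(-q) * bessel_i0(q)
    )


if __name__ == "__main__":
    # Illustrative values only: 30 equal atoms and three strong reflections.
    print(sign_probability(30, 2.0, 1.8, 1.5))
    print(cochran_density(0.0, 30, 2.0, 1.8, 1.5))
```

Because $-2\sin^2(\varphi/2) = \cos\varphi - 1$, the density in (9) is a von Mises distribution centred on zero, sharper for larger triple products and smaller N.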

The above is not, and is not meant to be, a complete review of all previous work on the subjects mentioned. The splendid work of Hauptman & Karle (1953) and their later work started from a Rayleigh distribution of structure-factor components to derive the probabilities of sign and phase relations for triplets, quadruplets, quintets, sextuplets, etc. in different neighbourhoods.

Thus it is evident that the probability of phase relations being valid and the cumulative intensity distribution function $N(z)$ both depend on the distribution law of the structure-factor components. Hence it is extremely plausible that a correlation should exist between $N_0(z)$ and $P(\varphi)$ and between $N_1(z)$ and $P_+(s_h s_{h'} s_{h-h'})$. The aim of the present work is to investigate this correlation for different distribution laws of the structure-factor components. The importance of this correlation in the direct determination of crystal structures is too evident to need re-emphasis.


3.  The  Random  variables  in  Gaussian  and  near 
Gaussian  distributions 

It has been seen that for both the sign and phase relationships, as well as for the $N(z)$ test, the assumption of atoms with equal weight is implicit. To achieve this for an asymmetric unit with one heavy atom at the origin, Sim (1958) considered not $F(\mathbf{h})$ but $F(\mathbf{h}) - f_H$ as the random variable. This formalism has been adopted by all subsequent workers in the field. However, this device of removing the heavy atom from the asymmetric unit and considering the light atoms to be randomly distributed is fraught with one glaring mistake. The position of the heavy atom being fixed, it remains inaccessible to the light atoms even when the heavy atom is no longer there. The case is now that of an outstandingly light atom, of weight zero, in the midst of an assembly of light atoms of approximately the same weight. This again is a deviation from the Feller (1969) conditions for the validity of the central-limit theorem. It can be rectified by filling every void from which a heavy atom has been removed with a light atom. Let us assume that the positions $\mathbf{r}_H$ occupied by the heavy atoms are known. The structure factor is then given by

$$F(\mathbf{h}) = \sum_H f_H \exp(2\pi i\,\mathbf{r}_H\cdot\mathbf{h}) + \sum_L f_L \exp(2\pi i\,\mathbf{r}_L\cdot\mathbf{h});$$

the $f_H$'s may have different values while the $f_L$'s are nearly equal. $\sum_H$ means summation over all heavy atoms, $\sum_L$ means summation over light atoms only, and $\sum_j$ represents summation over all atomic positions. If, now, the heavy atoms are removed from the known positions $\mathbf{r}_H$ and light atoms are placed there, we have

$$F(\mathbf{h}) - \sum_H (f_H - f_L)\exp(2\pi i\,\mathbf{r}_H\cdot\mathbf{h}) = \sum_j f_L \exp(2\pi i\,\mathbf{r}_j\cdot\mathbf{h}).$$

In this fictitious structure, all the $j$ atoms are of equal or nearly equal weight and all the $j$ atomic positions can be occupied by atoms of the same type with equal probability. Thus the restricted conditions for the central-limit theorem will be satisfied and $\{F(\mathbf{h}) - \sum_H (f_H - f_L)\exp(2\pi i\,\mathbf{r}_H\cdot\mathbf{h})\}$ becomes the random variable obeying a Gaussian or near-Gaussian distribution. The distribution in this case, as in the case of Sim (1958), is not a pure Gaussian but a shifted Gaussian. In Sim's (1958) case the shift was by $f_H$. In the present case the peak has been shifted by $\sum_H (f_H - f_L)\exp(2\pi i\,\mathbf{r}_H\cdot\mathbf{h})$. The standardised structure factor for this fictitious crystal is now

$$\frac{F(\mathbf{h}) - \sum_H (f_H - f_L)\exp(2\pi i\,\mathbf{r}_H\cdot\mathbf{h})}{\{\sum_j f_L^2\}^{1/2}}$$

and

$$z = \frac{|F(\mathbf{h})|^2}{\sum_H f_H^2 + \sum_L f_L^2}$$

will have to be replaced by $(p\sqrt{z} \pm R)^2$, where $p^2 = (r_1^2 + r_2^2)$, with

$$r_1^2 = \frac{\sum_H f_H^2}{\sum_j f_L^2}, \qquad r_2^2 = \frac{\sum_L f_L^2}{\sum_j f_L^2},$$

and

$$R = \sum_H \frac{(f_H - f_L)}{(\sum_j f_L^2)^{1/2}}\exp(2\pi i\,\mathbf{r}_H\cdot\mathbf{h}).$$

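With this formalism, it is easy to see that

$$N_1(z) = \tfrac{1}{2}\left[\operatorname{erf}\{(p\sqrt{z} - R)/\sqrt{2}\} + \operatorname{erf}\{(p\sqrt{z} + R)/\sqrt{2}\}\right], \tag{10}$$

and

$$N_0(z) = p^2\exp(-R^2)\int_0^z I_0(2pR\sqrt{z})\,\exp(-p^2 z)\,dz, \tag{11}$$

the factor $p^2$ in (11) ensuring that $N_0(z) \to 1$ as $z \to \infty$, in agreement with equation (6) for $p^2 = 1 + r^2$ and $R = r$.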

For equal atoms, Mitra & Belgaumkar (1973) had derived the following equation assuming the Edgeworth series as the distribution law


Ni  (z)  =  l  erf 


(12) 


where $\Gamma_z(x)$ is the incomplete gamma function.

For  crystals  with  heavy  atoms  at  known  positions, 
it  will  now  be 

tfi(z)  =  A[erf(p  Vz-R)lV 2 
+  erf  (pV  z  +  iR)/ a/  2] 
_l_  3  r r(pVz  -  ry  (4)  ,  ^(pVz  +  Ry  (4) 
,  2  ^  2 


4\/  7 r  L 
1 


4  y/ir  . 


(I) 


(pVJ-  J?)= 
2 

3^n 


_j_  r(P V2  +  i?)2  (f ) 
2 


(13) 


Figs. 1, 2, 3 and 4 show comparisons of experimental data due to Sim (1958) and Hargreaves (1955) with theoretical curves based on the equations due to these authors and on the present work. While Fig. 2 shows similar agreement with both equations (7) and (10) for the rubidium o-nitrobenzoate, Fig. 1 for the potassium o-nitrobenzoate shows a far superior agreement with equation (13). Figs. 3 and 4 also show excellent agreement with equation (10).

Fig. 1. Comparison of the experimental distribution N(z) for potassium hydrogen di-o-nitrobenzoate (marked by circles) with the theoretical distributions and Sim's plot.

From the above it is quite clear that the sign and phase relationships for crystals containing heavy atoms will have to be modified accordingly. For this case, the sign relationship would be

$$S(pE_h - R_h) \approx S(pE_{h'} - R_{h'})\,S(pE_{h-h'} - R_{h-h'}), \tag{14}$$

and the probability that the sign of $E_h$ will be the same as that of $R_h$ will be $P = \tfrac{1}{2} + \tfrac{1}{2}\tanh 2p\,|E_h R_h|$.

Fig.  2.  Comparison  of  the  experimental  distribution  N(z)  for 
rubidium  hydrogen  di-o-nitrobenzoate  (marked  by  circles) 
with  the  theoretical  distributions  and  Sim’s  plot. 


Fig.  3.  Comparison  of  the  experimental  distribution  N(z)  for 
4-para  carbethoxyphenyl  9-stibiafluorene  (marked  by  circles) 
with  the  theoretical  distributions  and  Hargreaves  plot. 
Fig. 4. Comparison of the experimental distribution N(z) for
Longifolin  hydrobromide  (marked  by  circles)  with  the 
theoretical  distributions  and  Hargreaves  plot. 

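Thus the probability that equation (14) will be valid, as well as that $E_h$ will be of the same sign as $R_h$, will be

$$P_+ = \left(\tfrac{1}{2} + \tfrac{1}{2}\tanh 2p\,|E_h R_h|\right)\left(\tfrac{1}{2} + \tfrac{1}{2}\tanh N^{-1/2}\,|(pE_h - R_h)(pE_{h'} - R_{h'})(pE_{h-h'} - R_{h-h'})|\right),$$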

»  n  /’9 


(15) 


2j/l  *1*1 


$p$ can be considered reasonably independent of $\mathbf{h}$.

4. Different types of distributions: their N(z), P+ and P(φ) expressions

The treatment in Section 3 is valid where the $R_h$'s are fully known, i.e. the positions of the heavy atoms have been unambiguously identified. In many problems this is not so, and then the distribution need not be restricted to the Gaussian function, the Gram-Charlier series or the Edgeworth series. Hauptman & Karle (1953) have used the Rayleigh distribution. Another distribution that suggests itself is the Cauchy distribution, but it is inadmissible, as it predicts an infinite average intensity. The different distributions and their $N_0(z)$, $N_1(z)$, $P_+$ and $P(\varphi)$ expressions are tabulated below.

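(a) Gaussian distribution

$$N_0(z) = 1 - \exp(-z), \qquad N_1(z) = \operatorname{erf}(z/2)^{1/2} \qquad \text{Howells, Phillips \& Rogers (1950)}$$

$$P_+ = \tfrac{1}{2} + \tfrac{1}{2}\tanh N^{-1/2}\,|E_h E_{h'} E_{h-h'}| \qquad \text{Cochran \& Woolfson (1955)}$$

$$P(\varphi) = \frac{\exp[-2X\sin^2\tfrac{1}{2}(\varphi_h - \varphi_{h'} - \varphi_{h-h'})]}{2\pi\,I_0(X)\exp(-X)} \qquad \text{Cochran (1955)}$$

where

$$X = 2N^{-1/2}\,|E_h E_{h'} E_{h-h'}|.$$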


Fig. 5. N(z) plot for centric Gaussian and Edgeworth distributions.

Fig. 6. N(z) plot for acentric Gaussian and Rayleigh distributions.


(b) Edgeworth distribution (Mitra & Belgaumkar, 1973)


where 


Although Mitra & Belgaumkar (1973) had derived expressions for the cumulative distribution function $N(z)$, they did not calculate the probability $P_+$ of the sign relationship $s_h s_{h'} s_{h-h'} \approx 1$ holding good when the structure-factor components obey the Edgeworth-series distribution law. This calculation, which is the prototype of all other calculations to follow, is shown below.

The Edgeworth distribution for the normalised structure amplitude is

$$P(E) = \frac{1}{\sqrt{2\pi}}\exp\left[-(E - \langle E\rangle)^2/2\right]\left[\tfrac{7}{8} + \tfrac{1}{4}(E - \langle E\rangle)^2 - \tfrac{1}{24}(E - \langle E\rangle)^4\right].$$

Evaluating $\langle E\rangle$ in the same way as described by Cochran & Woolfson (1955), after some simplification we obtain

4  (4  4-  4-h>  =  i  +  i  tanh  PAT-1"  4  4-  4-h' 

-  J  loge  (^+P)+i  \oge(A—B)],  (16) 

where 

-<  =  !  +  i  (4*  +  A'-1 4-  4h) 

-  J  (4  -  6  4  JV-1 4,  4  h,  +  AT-  4  4.,,.), 
B=  |  AT-«  1 4  4,  4-h'  |  -  i  14  4 '  4  It  I 

-  i  jv-s«  1 4  4, 4\,  | 


(c) Rayleigh distribution (Hauptman & Karle, 1953)


P(R)dR 


R 2‘ 

a3- 


C  2o*2  \  cr2  2cr2  / 


Wj  _3^_2 

3ct2  \  Or2 


3i?4_  i?6\  ) 

2^  6^/") 


d  R 


where $R = |F(hkl)|$.

Transforming to $z$ and integrating $\int_0^z p(z)\,dz$, we obtain

$$N_0(z) = 1 - e^{-z} - A z\,e^{-z} - B z^2 e^{-z} - C z^3 e^{-z}, \tag{17}$$




where 


A= 


+  B=- 

2  A  °\ ; 


_  2^6  c 
4a*  3a*’ 


o 


A 


/=»' 


5.  Concluding  comments 

The distributions discussed are not exhaustive. Many more distributions are still to be explored. The above calculations have been confined to the space groups $P1$ and $P\bar{1}$ only. The specialised expressions for different space groups are yet to be calculated.

Again, it should be remembered that the atomic arrangement in a unit cell is not of the nature of a random walk. It is a Markov-chain problem. Only one atom in a unit cell may be placed quite randomly. When we bring in the next atom it must be placed on the surface of a sphere of radius equal to the interatomic bond length (cf. Wilson, 1981, Patterson interpretation). As we bring in more atoms, the position of the $n$th atom depends on that of the $(n-1)$th atom, whose position depends on that of the $(n-2)$th atom, and so on. $P(n)$, the probability of the $n$th atom occupying the $n$th position, is then the Markov chain of the conditional probabilities $P(n \mid n-1 \mid n-2 \mid \ldots \mid 2 \mid 1)$. Our attention and efforts are to be concentrated on this problem to achieve the coveted aim of crystal-structure determination by direct analytical methods.
Abstract

Expansions for the probability density function of the structure factor $|F|$ (or $|E|$), which account for the known contribution of a heavy atom in the unit cell, have been presented. Using these modified probability density functions, expressions for the cumulative distributions $N(z)$ [or $N(|E|)$] have been derived. The expressions are composition- and symmetry-dependent and include the effect of atoms in general as well as in fixed special positions. The polynomial series distribution is used to derive an expression for $N(z)$ in the case of a hypercentric crystal. The effect of heavy atoms on two-phase structure seminvariants has been estimated in $P\bar{1}$ with the asymptotic form of the distribution. Numerical computation has been carried out to study the effect of heavy atoms, composition and symmetry on $N(z)$. The results have been compared with known crystal structures.


1.  Introduction 

The probability distribution functions of X-ray intensities were first obtained by Wilson (1949) for space groups $P1$ and $P\bar{1}$. In deriving these distributions, he assumed that the unit cell contained a large number of atoms at random positions and that there was no outstandingly heavy atom in the unit cell. Since then a number of tests based on the above distribution functions have been developed for the verification of space-group assignment and for the resolution of space-group ambiguities. Incorrect conclusions may be reached if any of the above assumptions is violated. The effects of hypersymmetry (Lipson & Woolfson, 1952; Rogers & Wilson, 1953; Nigam, 1974) and of outstandingly heavy atoms (Collin, 1955; Sim, 1958; Foster & Hargreaves, 1963) have been the subject of numerous studies, and the literature has been reviewed by Srinivasan & Parthasarathy (1976). In two recent publications (Shmueli, 1979; Shmueli & Wilson, 1981) the cumulative distribution functions $N(|E|)$ [or $N(z)$; Howells, Phillips & Rogers, 1950] have been generalised to depend on crystallographic symmetry and on the composition of the asymmetric unit. Suitable probability density functions, which were derived for centrosymmetric (Karle & Hauptman, 1953) and non-centrosymmetric (Hauptman & Karle, 1953; Srinivasan & Parthasarathy, 1976) crystals, are used. These distribution functions depend on the symmetry and atomic heterogeneity of the crystal. Though the effects of heavy atoms in general positions have been considered, heavy atoms in fixed special positions have not been treated explicitly in the analysis.

The present paper is an extension of the work of Shmueli (1979) and Shmueli & Wilson (1981). The probability density functions have been modified in terms of a heavy-atom-dependent parameter $r = f_p/(\Sigma_L)^{1/2}$. The expressions are more general than those obtained by Shmueli (1979), as they include the effects of atoms both in general and in fixed special positions. The polynomial series distribution is used to derive an expression for $N(z)$ in the case of a hypercentric crystal. The effect of heavy atoms on two-phase structure seminvariants in $P\bar{1}$ has also been studied. Finally, numerical computation is carried out to study the effects of heavy atoms, composition and symmetry. Wherever possible, the results have been tested on a few known crystal structures.
2. Derivation of cumulative distribution functions

Consider a crystal structure of $P$ known heavy atoms in fixed special positions and $L$ light atoms in general positions in the unit cell, such that $P + L = N$, where $N$ is the total number of atoms in the unit cell. The structure factor of a reflection $\mathbf{h}$ can be written as

$$F = F_p + F_L, \tag{1}$$

and we denote the local averages of $|F|^2$, $|F_p|^2$ and $|F_L|^2$ by $\Sigma$, $\Sigma_p$ and $\Sigma_L$ respectively. That is,

$$\langle |F|^2 \rangle = \sum_{i=1}^{N} f_i^2 = \Sigma, \qquad \langle |F_p|^2 \rangle = \sum_{i=1}^{P} f_i^2 = \Sigma_p, \qquad \langle |F_L|^2 \rangle = \sum_{i=1}^{L} f_i^2 = \Sigma_L. \tag{2}$$

The probability that $|F|$ lies between $|F|$ and $|F| + d|F|$ can be derived from the conditional distribution $P(|F|; |F_p|)$ by using the result

$$P(|F|) = \int P(|F|; |F_p|)\,P(|F_p|)\,d|F_p|. \tag{3}$$


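2.1. Centric case

Since $F_L$ follows the centric distribution, its probability density function can be written as (Karle & Hauptman, 1953)

$$P_c(F_L) = (2\pi\Sigma_L)^{-1/2}\exp\left(-\frac{F_L^2}{2\Sigma_L}\right)\left[1 + A\left(\frac{1}{3}\frac{F_L^4}{\Sigma_L^2} - 2\frac{F_L^2}{\Sigma_L} + 1\right) + B\left(\frac{1}{15}\frac{F_L^6}{\Sigma_L^3} - \frac{F_L^4}{\Sigma_L^2} + 3\frac{F_L^2}{\Sigma_L} - 1\right)\right], \tag{4}$$

where

$$A = (\langle |F_L|^4 \rangle_c - 3\Sigma_L^2)/8\Sigma_L^2, \tag{5}$$

$$B = (\langle |F_L|^6 \rangle_c - 15\Sigma_L\langle |F_L|^4 \rangle_c + 30\Sigma_L^3)/48\Sigma_L^3. \tag{6}$$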


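From (1) and (4) we obtain the conditional distribution $P_c(|F|; |F_p|)$ to be

$$P_c(|F|; |F_p|) = \frac{1}{(2\pi\Sigma_L)^{1/2}}\sum_{\pm}\exp\left[-\frac{(|F| \mp |F_p|)^2}{2\Sigma_L}\right]\left[1 + A\left(\frac{1}{3}\frac{(|F| \mp |F_p|)^4}{\Sigma_L^2} - 2\frac{(|F| \mp |F_p|)^2}{\Sigma_L} + 1\right) + B\left(\frac{1}{15}\frac{(|F| \mp |F_p|)^6}{\Sigma_L^3} - \frac{(|F| \mp |F_p|)^4}{\Sigma_L^2} + 3\frac{(|F| \mp |F_p|)^2}{\Sigma_L} - 1\right)\right], \tag{7}$$

where the sum is over the two terms with $|F| - |F_p|$ and $|F| + |F_p|$.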


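Equation (7) can be made more compact when the polynomials appearing in it are expressed in terms of Hermite polynomials $H_n(x)$, using the following identities

$$\frac{x^4}{3} - 2x^2 + 1 = \frac{1}{12}H_4\!\left(\frac{x}{\sqrt{2}}\right), \tag{8}$$

$$\frac{x^6}{15} - x^4 + 3x^2 - 1 = \frac{1}{120}H_6\!\left(\frac{x}{\sqrt{2}}\right). \tag{9}$$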
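The Hermite identities can be verified numerically. The sketch assumes they read $x^4/3 - 2x^2 + 1 = H_4(x/\sqrt{2})/12$ and $x^6/15 - x^4 + 3x^2 - 1 = H_6(x/\sqrt{2})/120$, with $H_n$ the physicists' Hermite polynomials; this reading is inferred from the coefficients 1/12 and 1/120 that appear in the later equations, and the function names are illustrative.

```python
import math


def hermite_phys(n: int, y: float) -> float:
    """Physicists' Hermite polynomial H_n(y) via H_{k+1} = 2y H_k - 2k H_{k-1}."""
    h_prev, h_curr = 1.0, 2.0 * y
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h_curr = h_curr, 2.0 * y * h_curr - 2.0 * k * h_prev
    return h_curr


def quartic(x: float) -> float:
    """Polynomial multiplying A in the centric density."""
    return x ** 4 / 3.0 - 2.0 * x ** 2 + 1.0


def sextic(x: float) -> float:
    """Polynomial multiplying B in the centric density."""
    return x ** 6 / 15.0 - x ** 4 + 3.0 * x ** 2 - 1.0


if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0, 2.0):
        y = x / math.sqrt(2.0)
        print(x, quartic(x) - hermite_phys(4, y) / 12.0, sextic(x) - hermite_phys(6, y) / 120.0)
```

The printed differences should vanish to rounding error, confirming that the polynomial brackets and the Hermite form are interchangeable.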
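For simplicity, we consider the case of an outstandingly heavy atom at a fixed position and $L$ heterogeneous atoms in general positions in the unit cell. In this case the distribution of $F_p$ is a delta function given by

$$P(F_p) = \delta(F_p - f_p), \tag{10}$$

where $f_p$ is the atomic scattering factor of the heavy atom. Using (3), (7), (8), (9) and (10) we obtain $P(|F|)\,d|F|$ as

$$P_c(|F|)\,d|F| = \frac{1}{(2\pi\Sigma_L)^{1/2}}\sum_{\pm}\exp\left[-\frac{(|F| \mp f_p)^2}{2\Sigma_L}\right]\left[1 + \frac{A}{12}H_4\!\left(\frac{|F| \mp f_p}{(2\Sigma_L)^{1/2}}\right) + \frac{B}{120}H_6\!\left(\frac{|F| \mp f_p}{(2\Sigma_L)^{1/2}}\right)\right]d|F|. \tag{11}$$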


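Substituting $z = |F|^2/(\Sigma_L + f_p^2)$ and $r = f_p/\Sigma_L^{1/2}$, one gets from (11)

$$P_c(z)\,dz = \tfrac{1}{2}\left\{\frac{1 + r^2}{2\pi z}\right\}^{1/2}\sum_{\pm}\exp\left\{-\frac{[(1 + r^2)^{1/2}z^{1/2} \mp r]^2}{2}\right\}\left[1 + \frac{A}{12}H_4\!\left(\frac{(1 + r^2)^{1/2}z^{1/2} \mp r}{\sqrt{2}}\right) + \frac{B}{120}H_6\!\left(\frac{(1 + r^2)^{1/2}z^{1/2} \mp r}{\sqrt{2}}\right)\right]dz. \tag{12}$$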
The cumulative distribution of $z$ is obtained by computing the integral $N_c(z) = \int_0^z P_c(z')\,dz'$, which gives

$$N_c(z) = \sum_{\pm}\left\{\tfrac{1}{2}\operatorname{erf}\!\left(\frac{(1 + r^2)^{1/2}z^{1/2} \pm r}{\sqrt{2}}\right) - \frac{1}{\sqrt{\pi}}\exp\left[-\frac{\{(1 + r^2)^{1/2}z^{1/2} \pm r\}^2}{2}\right]\left[\frac{A}{12}H_3\!\left(\frac{(1 + r^2)^{1/2}z^{1/2} \pm r}{\sqrt{2}}\right) + \frac{B}{120}H_5\!\left(\frac{(1 + r^2)^{1/2}z^{1/2} \pm r}{\sqrt{2}}\right)\right]\right\}, \tag{13}$$

or, in terms of $|E|$, the corresponding expression $N_c(|E|)$ obtained by setting $z = |E|^2$, (14)

where $E$ and $E_p$ are the normalized structure factors of the total and heavy-atom components respectively.
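The centric heavy-atom distribution can be evaluated and cross-checked numerically. The sketch assumes that the density (12) carries the corrections $(A/12)H_4 + (B/120)H_6$ and that its integral (13) carries $(A/12)H_3 + (B/120)H_5$ with prefactor $\pi^{-1/2}$, as obtained by integrating (12) term by term. The values of r, A and B used are arbitrary illustrations, not values from the paper.

```python
import math


def hermite_phys(n: int, y: float) -> float:
    """Physicists' Hermite polynomial H_n(y) by upward recurrence."""
    h_prev, h_curr = 1.0, 2.0 * y
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h_curr = h_curr, 2.0 * y * h_curr - 2.0 * k * h_prev
    return h_curr


def density_c(z: float, r: float, a_coef: float, b_coef: float) -> float:
    """Assumed form of equation (12): centric density of z with one heavy atom."""
    a = math.sqrt((1.0 + r * r) * z)
    total = 0.0
    for sign in (1.0, -1.0):
        x = a + sign * r
        y = x / math.sqrt(2.0)
        total += math.exp(-0.5 * x * x) * (
            1.0 + a_coef / 12.0 * hermite_phys(4, y) + b_coef / 120.0 * hermite_phys(6, y)
        )
    return 0.5 * math.sqrt((1.0 + r * r) / (2.0 * math.pi * z)) * total


def n_c(z: float, r: float, a_coef: float, b_coef: float) -> float:
    """Assumed form of equation (13): closed-form cumulative distribution."""
    a = math.sqrt((1.0 + r * r) * z)
    total = 0.0
    for sign in (1.0, -1.0):
        x = a + sign * r
        y = x / math.sqrt(2.0)
        total += 0.5 * math.erf(y) - math.exp(-0.5 * x * x) / math.sqrt(math.pi) * (
            a_coef / 12.0 * hermite_phys(3, y) + b_coef / 120.0 * hermite_phys(5, y)
        )
    return total


def integrate_density(z: float, r: float, a_coef: float, b_coef: float, steps: int = 4000) -> float:
    """Midpoint integration of (12) after substituting z' = s^2 to remove z'^(-1/2)."""
    h = math.sqrt(z) / steps
    total = 0.0
    for k in range(steps):
        s = (k + 0.5) * h
        total += density_c(s * s, r, a_coef, b_coef) * 2.0 * s
    return total * h


if __name__ == "__main__":
    print(n_c(1.5, 1.0, -0.05, 0.01), integrate_density(1.5, 1.0, -0.05, 0.01))
```

With A = B = 0 the function reduces to Sim's centric expression, so the Hermite terms can be read as symmetry- and composition-dependent corrections to it.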

2.2.  Acentric  case 


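The suitable density function for $F_L$ is given by (Hauptman & Karle, 1953)

$$P_a(F_L) = \frac{1}{\pi\Sigma_L}\exp\left(-\frac{F_L^2}{\Sigma_L}\right)\left[1 + C\left(\frac{1}{2}\frac{F_L^4}{\Sigma_L^2} - 2\frac{F_L^2}{\Sigma_L} + 1\right) + D\left(\frac{1}{6}\frac{F_L^6}{\Sigma_L^3} - \frac{3}{2}\frac{F_L^4}{\Sigma_L^2} + 3\frac{F_L^2}{\Sigma_L} - 1\right)\right], \tag{15}$$

where

$$C = (\langle |F_L|^4 \rangle_a - 2\Sigma_L^2)/2\Sigma_L^2, \tag{16}$$

$$D = (\langle |F_L|^6 \rangle_a - 9\Sigma_L\langle |F_L|^4 \rangle_a + 12\Sigma_L^3)/6\Sigma_L^3. \tag{17}$$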

Since $F$ is a vector, we may rewrite (1) as

$$|F_L|^2 = |F|^2 + |F_p|^2 - 2|F|\,|F_p|\cos\alpha, \tag{18}$$

where $\alpha$ is the angle between $F$ and $F_p$. From (15) and (18), integrating with respect to $\alpha$ in the range 0 to $2\pi$, $P_a(|F|; |F_p|)$ is obtained. The relation

$$\int_0^{2\pi}\exp(z\cos\alpha)\cos m\alpha\,d\alpha = 2\pi I_m(z), \tag{19}$$

where $I_m(z)$ is the modified Bessel function, is used for the integration. From $P_a(|F|; |F_p|)$, after proper simplification for a single heavy atom, using (10) and substituting for $z$, we finally arrive at

Fa(z)  dz  =  (1+r2)  exp  —  {r2+z  (1+r2)}  [70  (kzm) 

+  C  {(i  (z  (1+r2)  +  r2)2  +  z  (l+r2)-r2)  I0  (kz1'2) 
-  2  {r2  +  (1+r2)  z}  70  (kz'*)  +  70  (kz'*)}] 

+  D{\  [(z  (1+r2)  +  r2)3  +  6  (z  (1+r2) 

+  r2)  (z  (l+r2)-r2)]  70  (kz'*)  —  §  [(z  (1+r2)  +  r2 
+  2z  r2  (1+r2)]  70  (A:z1/2)  -  3z  r2  (1+r2)  70  (kz1*) 
+  3  (z  (1+r2)  +  r2)  70  (kzx'2)  -  70  (kz^)} 

+  C  {— (2z  (1+r2)  +  r 2)  z x'2  (1  +r2)x'2-r  7X  (fcz1'2) 
+  zr2  (1+r2)  72  (/cz)1/2  +  4r2  z  (1+r2)  7X  (Arz)1/2} 

+  Z>  [z1/2  r  (1+r2)  (z  (1+r2)  +  r2)2 
+  (z  (1+r2)  +  r2)3]  7X  (kz1'2)  +  (z  (1+r2)  +  r2) 

X  z  (1+r2)  r2  72 {kzxl2)-\ z3'2  (1+r2)3/2  r3  73  (fcz1'2) 
+  6  (z  (1+r2)  +  r2)  z1/2  r  (1+r2)  7X  (kzX12) 

-  6  z1/2  r  (1  +r2)1/2  7X  (A:z1/2),  (20) 

where

$$k = 2r(1 + r^2)^{1/2}.$$

Fig. 1. Theoretical N(z) distribution curves for $\bar{1}$, 2/m, mmm (dashed lines) compared with Wilson's centric and acentric curves (solid lines) for C20 1.

The cumulative distribution is obtained in the usual way by numerical techniques. We have considered C207, a hypothetical crystal structure (Shmueli, 1979). The values of $A$ and $B$ for different centrosymmetric point groups are known. The value of $r$ is taken as

r  =  fp  =  ft 

(^,)1/2  (27c)i/2 


at $\sin\theta = 0.45$.


The results of the calculations are shown in Fig. 1. It is seen that the $N(z)$ functions for the different centric point groups, as calculated from (13), cluster more closely around the Wilson-type acentric curve. This is consistent with Sim's analysis and may be attributed to the heavy-atom effect. Fig. 2 shows the experimental $N(z)$ plot for rubidium di-o-nitrobenzoate (Sim, 1958) compared with the cumulative distribution function calculated from (13) and with Sim's theoretical values. It is observed that the theoretical curve calculated from (13) agrees with experiment better than that fitted by Sim (1958).


Fig. 2. Comparison of experimental N(z) points (marked by circles) of rubidium di-o-nitrobenzoate with (1) the theoretical distribution from (13) and (2) Sim's curve.
3. Derivation of hypercentric distribution

It is well known (Wilson, 1949) that pseudosymmetry
can give rise to abnormal intensity distributions. The
effect has been studied using Wilson statistics. In what
follows we examine the hypercentric intensity
distribution using the expansion of the probability function
given in (4). It is thus possible to include the effect
of symmetry and composition on the corresponding
cumulative distribution functions. The hypercentric
intensity distribution arises when two molecules, each
with an inversion centre, occupy general positions in a
centrosymmetric space group. Following Lipson and
Woolfson (1952), and using a suitable form of the probability
density distribution, the probability that F lies between
F and F + dF is given by


$$P_h(F)\,dF = (4\pi\Sigma)^{-1/2}\int_{u=0}^{u=1}\exp\!\left(-\frac{F^{2}\sec^{2}\tfrac{1}{2}\pi u}{4\Sigma}\right)\sec\tfrac{1}{2}\pi u\;dF\,du, \qquad (21)$$


where $u = 4\,\mathbf{s}\cdot\mathbf{r}'$, $\mathbf{s}$ is the reciprocal-lattice vector and $\mathbf{r}'$ is the
position of the molecular centre of inversion relative to
that of the unit cell. Substituting $t = \tan\tfrac{1}{2}\pi u$ and
$z = F^{2}/\Sigma$, one gets from (21)

Ph  (2)  dz  =  7,-1  (47,  z)-1/2  J'=c°  exp  {  -  \z  (1  +  t2)} 

x[l+^4(zl^12+^1/2) 


+  bh6 


fzl/2  (1  _|_  ,8)1/21 


1  + 


d t  dz.  (22) 


The  cumulative  distribution  function  Nh(z )  follows 
from  (22)  as 

W  =  "T-1  (4w)-1/a  J'o  J"  2- w  exp  i  (1  +  (2)  j 
<zm  (1  + 


[l  +  AH 4  (: 


+  BH, 


( 


zm  (1  +  ,2)1/2 


j  +  ••• 


dt  dz.

Setting $t = \tan\psi$ and $z = 2a^{2}/(1 + t^{2})$, the integration with
respect to z can be evaluated and so one gets


iVz  ,\  /  z  „  ,\  , .  45  p 

(  —  sec  ip]  exp  —  -  sec2 4>  )chp  —  —  I 
\  2  /  \  4  1  tt*/2  J  0 


45  f77/2 


H , 


sec  i exp  ^  —  isec2  t/<j  (23) 


The expression for $N_h(|E|)$ follows from (23) by
substituting $E^{2} = z$. The integrals in (23) were evaluated
numerically in order to compare them with the experimental
N(z) distribution (h0l projection) of pyrene
(C16H10), which is an example of a hypercentric crystal.
Fig. 3 shows the results. Fig. 4 shows the N(z) plot from
(hkl) data of the 7,7,8,8-tetracyanoquinodimethane-phenazine
complex (Goldberg and Shmueli, 1973). It
is seen that the agreement in both cases is fairly good.
The theoretical curve shows a slight tendency to be
displaced towards Wilson's centric distribution. This
may be attributed to the inclusion of higher-order
terms in the polynomial density distribution function.

Fig. 3. Comparison of experimental N(z) points (marked by
circles) of pyrene (h0l projection) with (1) the bicentric distribution
due to Lipson & Woolfson (1952) and (2) the theoretical
distribution from (23).

Fig. 4. Comparison of experimental N(z) points (marked by
circles) of the 7,7,8,8-tetracyanoquinodimethane-phenazine
complex with (1) the bicentric curve due to Lipson & Woolfson
(1952) and (2) the theoretical distribution from (23).
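As a rough numerical companion to the comparison curves in Figs. 3 and 4, the sketch below evaluates only the leading (bicentric) term of the hypercentric cumulative distribution and sets it against Wilson's centric curve. It assumes the standard one-dimensional form $N(z) = (2/\pi)\int_0^{\pi/2}\operatorname{erf}(\tfrac{1}{2}z^{1/2}\sec\psi)\,d\psi$ for the bicentric case and omits the higher-order (Hermite) correction terms of (23) altogether; the function names and the midpoint integration rule are illustrative choices, not taken from the paper.

```python
import math


def n_centric(z):
    """Wilson centric cumulative distribution, N(z) = erf(sqrt(z/2))."""
    return math.erf(math.sqrt(z / 2.0))


def n_bicentric(z, steps=2000):
    """Leading-term bicentric N(z), without higher-order corrections.

    Integrates (2/pi) * erf(0.5*sqrt(z)/cos(psi)) over psi in (0, pi/2)
    with the midpoint rule, so psi = pi/2 itself is never evaluated.
    """
    h = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        psi = (i + 0.5) * h
        total += math.erf(0.5 * math.sqrt(z) / math.cos(psi))
    return (2.0 / math.pi) * total * h


if __name__ == "__main__":
    for z in (0.1, 0.2, 0.5, 1.0):
        print(z, round(n_centric(z), 3), round(n_bicentric(z), 3))
```

For small z the bicentric value lies above the centric one, which is the excess of weak reflections that marks hypersymmetry in an N(z) plot.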

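4. Heavy-atom contribution and two-phase
structure seminvariants in space group $P\bar{1}$

In space group $P\bar{1}$, the linear combination

$$\phi = \phi_{\mathbf h} + \phi_{\mathbf k} \qquad (24)$$

is a structure seminvariant if and only if

$$\mathbf h + \mathbf k \equiv 0 \pmod{\boldsymbol\omega_s}, \qquad (25)$$

where $\boldsymbol\omega_s = (2, 2, 2)$.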
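Condition (25) amounts to requiring that every index of h + k be even. A minimal sketch of that check follows; the function name and the tuple representation of the reflection indices are assumptions made for illustration only.

```python
def is_two_phase_seminvariant(h, k, modulus=(2, 2, 2)):
    """Return True if h + k is congruent to 0 modulo `modulus` (eq. 25).

    h and k are (h, k, l) index triples; negative indices are allowed.
    """
    return all((hi + ki) % m == 0 for hi, ki, m in zip(h, k, modulus))


if __name__ == "__main__":
    print(is_two_phase_seminvariant((1, 1, 0), (1, 3, 0)))  # indices sum to (2, 4, 0)
    print(is_two_phase_seminvariant((1, 1, 0), (2, 3, 0)))  # indices sum to (3, 4, 0)
```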

The normalized structure factors of the reflections $\mathbf h$ ($h_1 k_1 l_1$)
and $\mathbf k$ ($h_2 k_2 l_2$) can be written as

$$\Sigma^{1/2}E(\mathbf h) = \Sigma_p^{1/2}E_p(\mathbf h) + \Sigma_L^{1/2}E_L(\mathbf h), \qquad (26)$$

$$\Sigma^{1/2}E(\mathbf k) = \Sigma_p^{1/2}E_p(\mathbf k) + \Sigma_L^{1/2}E_L(\mathbf k), \qquad (27)$$

where $E_p$ and $E_L$ are the normalized structure factors
of the heavy-atom component and the light-atom
component respectively. The number of light atoms
is assumed to be large so that Wilson's centric distribution
is valid. The joint probability distribution
$P(E(\mathbf h), E(\mathbf k), E_p(\mathbf h), E_p(\mathbf k))$ can be written as (Klug,
1958)

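$$P(E(\mathbf h), E(\mathbf k), E_p(\mathbf h), E_p(\mathbf k)) = \frac{1}{2\pi}\exp\left\{-\tfrac{1}{2}\left[(E(\mathbf h)-E_p(\mathbf h))^{2} + (E(\mathbf k)-E_p(\mathbf k))^{2}\right]\right\}. \qquad (28)$$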

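The conditional joint probability distribution $P(E(\mathbf h),
E(\mathbf k); E_p(\mathbf h), E_p(\mathbf k))$ can be readily obtained from the
relation

$$P(E(\mathbf h), E(\mathbf k); E_p(\mathbf h), E_p(\mathbf k)) = \frac{P(E(\mathbf h), E(\mathbf k), E_p(\mathbf h), E_p(\mathbf k))}{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} P(E(\mathbf h), E(\mathbf k), E_p(\mathbf h), E_p(\mathbf k))\,dE(\mathbf h)\,dE(\mathbf k)}. \qquad (29)$$

From (28) and (29) one gets

$$P(E(\mathbf h), E(\mathbf k); E_p(\mathbf h), E_p(\mathbf k)) = \frac{1}{2\pi}\exp\left\{-\tfrac{1}{2}\left[(E(\mathbf h)-E_p(\mathbf h))^{2} + (E(\mathbf k)-E_p(\mathbf k))^{2}\right]\right\}. \qquad (30)$$

Equation (30) may be used to calculate $\langle E(\mathbf h)E(\mathbf k); E_p(\mathbf h)E_p(\mathbf k)\rangle$ as

$$\langle E(\mathbf h)E(\mathbf k); E_p(\mathbf h)E_p(\mathbf k)\rangle = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} E(\mathbf h)E(\mathbf k)\,P(E(\mathbf h), E(\mathbf k); E_p(\mathbf h), E_p(\mathbf k))\,dE(\mathbf h)\,dE(\mathbf k) = E_p(\mathbf h)E_p(\mathbf k). \qquad (31)$$

Similarly

$$\langle E^{2}(\mathbf h)E^{2}(\mathbf k); E_p(\mathbf h)E_p(\mathbf k)\rangle = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} E^{2}(\mathbf h)E^{2}(\mathbf k)\,P(E(\mathbf h), E(\mathbf k); E_p(\mathbf h), E_p(\mathbf k))\,dE(\mathbf h)\,dE(\mathbf k) = 1 + E_p^{2}(\mathbf h) + E_p^{2}(\mathbf k) + E_p^{2}(\mathbf h)E_p^{2}(\mathbf k). \qquad (32)$$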

We  may  now  expand  the  conditional  probability 




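distribution of the random variable $R = E(\mathbf h)E(\mathbf k)$ in a
Gram-Charlier series (Cramér, 1951); we obtain

$$P(R; E_p(\mathbf h), E_p(\mathbf k)) = \frac{1}{(2\pi\sigma^{2})^{1/2}}\exp\left[-\frac{(R-\langle R\rangle)^{2}}{2\sigma^{2}}\right], \qquad (33)$$

where $\langle R\rangle$ is given by (31), and

$$\sigma^{2} = 1 + E_p^{2}(\mathbf h) + E_p^{2}(\mathbf k). \qquad (34)$$

If $P_+$ denotes the probability that $E(\mathbf h)E(\mathbf k)$ has the same
sign as $E_p(\mathbf h)E_p(\mathbf k)$, then it can readily be shown that
(Klug, 1958; Giacovazzo, 1975)

$$P_+ = \tfrac{1}{2} + \tfrac{1}{2}\tanh\left[\frac{|E(\mathbf h)E(\mathbf k)|\,|E_p(\mathbf h)E_p(\mathbf k)|}{1 + E_p^{2}(\mathbf h) + E_p^{2}(\mathbf k)}\right]. \qquad (35)$$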


Equation (35) can be used to estimate the sign of the
product $E(\mathbf h)E(\mathbf k)$. This formula fulfils the requirement
that $P_+ = \tfrac{1}{2}$ whenever $E_p(\mathbf h) = 0$ or $E_p(\mathbf k) = 0$. Equation
(35) was tested on the hk0 data of metanilic acid (Hall
& Maslen, 1965). In this structure $\Sigma f^{2}_{\mathrm{light}} > \Sigma f^{2}_{\mathrm{heavy}}$,
indicating that the contribution from the heavy
atoms is not large enough to dominate the phases
of the structure. The results of the calculations are
presented in Table 1. Whenever $P_+$ is large ($P_+ > 0.7$)
the product $E(\mathbf h)E(\mathbf k)$ has the same sign as
$E_p(\mathbf h)E_p(\mathbf k)$. It is further noted
that the sign probabilities for $E(\mathbf h)E(\mathbf k)$ are
always dominated by the heavy-atom contribution to the
normalized structure factors, and that the probabilities are
high for large products. The sign indications
are always poor when the structure factors
involved are small in magnitude.
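Formula (35) is simple enough to evaluate directly. The following is a minimal sketch of it; the function name and the example magnitudes are illustrative and are not the metanilic acid values of Table 1.

```python
import math


def p_plus(e_h, e_k, ep_h, ep_k):
    """Probability (eq. 35) that E(h)E(k) has the sign of Ep(h)Ep(k).

    e_h, e_k   : normalized structure factors of the whole structure
    ep_h, ep_k : normalized structure factors of the heavy-atom part
    Only magnitudes enter the formula, so signs of the inputs are ignored.
    """
    arg = abs(e_h * e_k) * abs(ep_h * ep_k) / (1.0 + ep_h ** 2 + ep_k ** 2)
    return 0.5 + 0.5 * math.tanh(arg)


if __name__ == "__main__":
    print(p_plus(2.0, 1.5, 0.0, 1.0))   # no heavy-atom information for h
    print(p_plus(2.0, 2.0, 1.5, 1.5))   # large products give a strong indication
    print(p_plus(0.3, 0.4, 0.5, 0.5))   # small magnitudes give a weak indication
```

The first call returns exactly one half, as required when either heavy-atom contribution vanishes, and the indication weakens as the magnitudes shrink, in line with the behaviour described above.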


5.  Conclusions 

The present study confirms that the modified probability
density functions are well suited for the evaluation
of the N(|E|) or N(z) functions, as the effects of heavy
atoms, symmetry and composition of the asymmetric
unit are included in the final expressions. In the case of
light-atom hypersymmetric structures, the cumulative
distribution functions given by Lipson and Woolfson
(1952) and Rogers and Wilson (1953) are accurate
enough to detect the hypersymmetry. The effect of
higher-order terms is not very significant, but the
situation may change altogether if hypersymmetry
and an outstandingly heavy atom are present simultaneously.

Table 1. Probability calculations for signs of reflections
for metanilic acid (index digits that are illegible in the source are shown as ?)

E(h)      E(k)      Sign of E(h)E(k)   P+ from (35)   Sign of Ep(h)Ep(k)
                    (observed)
E1,1      E1,3      +                  0.57           -
E1,3      E1,1      +                  0.70           +
E1,6      E1,?      +                  0.66           -
E1,?      E1,9      -                  0.62           +
E1,9      E1,11     +                  0.57           -
E1,11     E1,13     -                  0.54           +
E1,8      E1,10     +                  0.55           -
E1,12     E1,1?     +                  0.70           +
E1,5      E1,?      +                  0.99           +
E1,9      E1,11     -                  0.99           -
E1,1      E1,3      -                  0.85           -
E1,11     E1,13     +                  0.75           +

The formula (35) may be used together with known
structural information to compute the phases of
structure seminvariants, which are essential in any
phase-determination process by direct methods. It
would be interesting to have a formula for the sign
of structure seminvariants in which both structural
information and the neighbourhood concept are included
simultaneously. This is the subject of further investigation.

Measurability  of  Bijvoet  Differences*

Abstract

The determination of the absolute configuration of
molecules and the structure determination of non-centrosymmetric
(NC) crystals with heavy atoms are
two of the important applications of anomalous
scattering with phase shift. A statistical method is
described for selecting the few optimum reflections for Bijvoet
difference (BD) measurement (at the stage where the
positions of the heavy atoms are known) for the
purpose of determining the absolute configuration.
The success of the anomalous scattering
method for determining the structures of NC crystals
depends on the possibility of measuring accurately
the BDs of a large percentage of reflections. The
measurability of BDs can be studied from a knowledge
of the probability distribution of normalized
BD variables or the Bijvoet ratio. The measurability
is expected to be influenced by structural features
(e.g. presence of centrosymmetric parts in the molecule,
space-group symmetry and the degree of
centrosymmetry of the crystal) as well as by the
non-observability of extremely weak reflections. After
dealing with the optimum conditions for observing
large BDs in a perfectly NC crystal, the influence of
various structural features and of data truncation on
the measurability is considered.

1.  Introduction 

If the wavelength of the incident X-rays is close to and
less than the absorption edge of an atom, the scattering
from the atom becomes anomalous such that the
scattering factor has a component which is 90° ahead
in phase with respect to normal scattering. When there
is anomalous scattering with phase shift (see Ramaseshan,
1964, for the terminology) the intensities of
the inverse reflections $\mathbf H$ (= $h\,k\,l$) and $\bar{\mathbf H}$ (= $\bar h\,\bar k\,\bar l$) of
a non-centrosymmetric (NC, hereafter) crystal are no
longer equal, resulting in the breakdown of Friedel's
law (James, 1958). The difference in the intensities of
the inverse reflections is called the Bijvoet difference
(BD, hereafter) owing to the pioneering work of
Bijvoet on the crystallographic applications of this
difference (Bijvoet, 1952, 1954, 1955). Two of the
important uses of BD measurement are: (i) determining
the phases of reflections in NC crystals and
using them to elucidate the structure (Ramachandran
& Raman, 1956; Peerdeman & Bijvoet, 1956)
and (ii) establishing the absolute configuration of
molecules or atomic arrangement in NC crystals
(Bijvoet, Peerdeman & van Bommel, 1951). While
structure determination of an NC crystal by the
anomalous scattering method* via either the quasi-anomalous
synthesis (Ramachandran & Raman,
1956; Ramachandran & Parthasarathy, 1965) or the
weighted anomalous synthesis (Parthasarathy, Ramachandran
& Srinivasan, 1964; Sim, 1964) requires
the accurate measurement of the BDs of a fairly large
percentage of reflections, the determination of the
absolute configuration by the Bijvoet method requires
the measurement of the BDs of only a few (a dozen, say)
reflections. With respect to these applications we
shall use the concepts of 'measurability of BDs of a
crystal' and 'measurability of BD of a reflection'.
The measurability of BDs of a crystal is concerned
with the suitability of a crystal
for BD data collection for the purpose of structure
determination, while the measurability of BD of a
reflection is concerned with the
suitability of any given reflection for BD measurement.
In this article we shall define the two measurabilities
quantitatively, using probability criteria. We shall then
discuss the optimum conditions for the measurability
of BDs of a crystal and study how this measurability
is influenced by a number of structural and non-structural
features. We shall also discuss how the
concept of measurability of BD of a reflection can be
used for selecting the few optimum reflections for
BD measurement for the purpose of establishing the
absolute configuration.

*We use the term 'structure determination by anomalous
scattering method' to mean the determination of the positions
of a sufficient percentage of atoms in the structure from either
the quasi-anomalous or weighted anomalous synthesis. The
rest of the atoms in the unit cell can then be determined by the
standard Fourier methods (see Ramachandran & Srinivasan,
1970).


2.  Notation  and  preliminary  results 

Consider an NC crystal containing N atoms in the
unit cell, of which P atoms are anomalous scatterers
and the remaining Q (= N − P) atoms are normal
scatterers. In the case of X-rays the anomalous
scatterers are generally heavy atoms while the normal
scatterers are light atoms such as C, N and O. In our
discussion we shall take all the anomalous scatterers
in the asymmetric unit to be of the same type and the
normal scatterers to be of similar scattering power.
Heavy-atom derivatives of most organic molecules and
biomolecules come under this situation.

The scattering factor of an atom under anomalous
scattering with phase shift is a complex quantity and
can be written as

$$f = f_0 + f' + i f'', \qquad (1)$$

where $f_0$ is the high-frequency limit of the scattering
factor and $f'$ and $f''$ are the real and imaginary dispersion
corrections (for this notation see Ramaseshan
& Abrahams, 1975). In the theoretical probability
distribution functions of the BD and the Bijvoet ratio (BR,
hereafter) the ratio of the imaginary to the total real
part of the atomic scattering factor of the anomalous
scatterer enters as one of the parameters of the
distributions. We shall denote it by k. That is,

$$k = f''/(f_0 + f'). \qquad (2)$$

The structure factor of a reflection $\mathbf H$ can be written
in terms of the contributions from the P- and Q-atoms
as

$$F_N(\mathbf H) = F'_P(\mathbf H) + F''_P(\mathbf H) + F_Q(\mathbf H) = F'_N(\mathbf H) + F''_P(\mathbf H), \qquad (3)$$

where

$$F'_N(\mathbf H) = F'_P(\mathbf H) + F_Q(\mathbf H). \qquad (4)$$

$F'_P(\mathbf H)$ and $F''_P(\mathbf H)$ are the contributions to the structure
factor of reflection $\mathbf H$ from the real and imaginary
parts respectively of the atomic scattering factor of
the P-atoms, and $F_Q(\mathbf H)$ is the contribution to the
structure factor from the Q-atoms. The structure
factor relations for the inverse reflections $\mathbf H$ and $\bar{\mathbf H}$
in the presence of anomalous scattering can be
conveniently represented in the Argand diagram (Fig.
1). Here the diagram for reflection $\bar{\mathbf H}$ is shown reflected
about the real axis in order to show clearly the occurrence
of the BD. In crystals with a single species of
anomalous scatterer we can show that

$$F''_P(\mathbf H) = i\,k\,F'_P(\mathbf H), \qquad (5)$$

which implies that

$$|F''_P(\mathbf H)| = k\,|F'_P(\mathbf H)|, \qquad \alpha''_P(\mathbf H) = \alpha'_P(\mathbf H) + \pi/2. \qquad (6)$$

We shall define

$$\theta = \alpha'_N - \alpha'_P, \qquad \psi = \alpha_Q - \alpha'_P. \qquad (7)$$

Fig. 1. Argand diagram representation of the structure factor
relations for the inverse reflections $\mathbf H$ and $\bar{\mathbf H}$.

Using geometrical considerations, theoretical expressions
for the BD and the mean intensity of the inverse
reflections $\mathbf H$ and $\bar{\mathbf H}$ can be derived from Fig. 1 as

$$\Delta I = I(\mathbf H) - I(\bar{\mathbf H}) = 4\,|F'_N|\,|F''_P|\,\sin\theta, \qquad (8)$$

$$\bar I = \tfrac{1}{2}\left[I(\mathbf H) + I(\bar{\mathbf H})\right] = |F'_N|^{2} + |F''_P|^{2}, \qquad (9)$$

where (8) is the Ramachandran-Raman-Peerdeman-Bijvoet
formula for the BD and is the starting point for
the theoretical derivation of the probability distributions
of the BD variables for different situations.


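2.1.  Normalized  Bijvoet  difference  and  Bijvoet  ratio
variables

In the theoretical studies on the measurability of
BDs a number of normalized BD and BR variables
have been defined and used. The normalized BDs x
and $\Delta$ are defined to be (Parthasarathy & Srinivasan,
1964; Parthasarathy, 1967)

$$x = \frac{|\Delta I|}{4\left[\langle|F_Q|^{2}\rangle\,\langle|F''_P|^{2}\rangle\right]^{1/2}}, \qquad (10)$$

$$\Delta = |\Delta I| / \langle|F'_N|^{2}\rangle. \qquad (11)$$

The Bijvoet ratio X is defined to be (Zachariasen,
1965)

$$X = \frac{|\Delta I|}{\tfrac{1}{2}\left[I(\mathbf H) + I(\bar{\mathbf H})\right]} = \frac{|\Delta I|}{\bar I} = \frac{4\,|F'_N|\,|F''_P|\,|\sin\theta|}{|F'_N|^{2} + |F''_P|^{2}}. \qquad (12)$$

For X-rays, in most cases $|F''_P|^{2} \ll |F'_N|^{2}$, so that $|F''_P|^{2}$
in the denominator of (12) may be neglected (particularly
when k is not large) in comparison with $|F'_N|^{2}$
(Parthasarathy, 1967). We denote the resulting
quantity by $\delta$ and call it the modified BR. That is,

$$\delta = |\Delta I| / |F'_N|^{2} = 4\,|F''_P|\,|\sin\theta| / |F'_N|. \qquad (13)$$

For the theoretical treatment it is convenient to express
these in terms of the normalized structure factor
magnitudes $y'_N$, $y'_P$ and $y_Q$, which are defined as

$$y'_a = |F'_a| / \langle|F'_a|^{2}\rangle^{1/2}, \quad a = N \text{ or } P; \qquad y_Q = |F_Q| / \langle|F_Q|^{2}\rangle^{1/2}. \qquad (14)$$

From (8)-(14) we can readily show that

$$x = y'_P\,y_Q\,|\sin\psi|, \qquad (15)$$

$$\Delta = 4\,k\,\sigma_1\sigma_2\,x, \qquad (16)$$

$$\delta = 4\,k\,\sigma_1\,y'_P\,|\sin\theta| / y'_N, \qquad (17)$$

$$X = \frac{4\,k\,\sigma_1\,y'_N\,y'_P\,|\sin\theta|}{y'^{2}_N + k^{2}\sigma_1^{2}\,y'^{2}_P}, \qquad (18)$$

where $\sigma_1^{2}$ and $\sigma_2^{2}$ are defined to be

$$\sigma_1^{2} = \langle|F'_P|^{2}\rangle / \langle|F'_N|^{2}\rangle, \qquad \sigma_2^{2} = \langle|F_Q|^{2}\rangle / \langle|F'_N|^{2}\rangle. \qquad (19)$$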

It is useful to note here that, though the BD and BR
of a reflection can take both positive and negative
values, we have defined the variables x, $\Delta$, X and $\delta$
to be positive. This is because, for studying the
measurability, the relevant quantities are only the
magnitudes of these variables.
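The relations (3)-(13) can be checked numerically for a single reflection. The sketch below builds the structure factors for $\mathbf H$ and $\bar{\mathbf H}$ from a heavy-atom part and a light-atom part and compares the directly computed Bijvoet difference with formula (8). It assumes, as in the Argand diagram, that the reflected structure factor of $\bar{\mathbf H}$ is $F'_N - F''_P$; the function name, the dictionary keys and the numerical inputs are illustrative assumptions.

```python
import cmath
import math


def bijvoet_quantities(f_p_real, f_q, k):
    """Bijvoet difference and ratios for one reflection.

    f_p_real : complex F'_P(H), real-part contribution of the P-atoms
    f_q      : complex F_Q(H), contribution of the Q-atoms
    k        : f'' / (f0 + f') for the anomalous scatterer, eq. (2)
    """
    f_n = f_p_real + f_q               # F'_N, eq. (4)
    f_p_imag = 1j * k * f_p_real       # F''_P, eq. (5)
    i_h = abs(f_n + f_p_imag) ** 2     # I(H)
    i_hbar = abs(f_n - f_p_imag) ** 2  # I(H-bar), diagram reflected about the real axis
    delta_i = i_h - i_hbar
    mean_i = 0.5 * (i_h + i_hbar)
    theta = cmath.phase(f_n) - cmath.phase(f_p_real)  # eq. (7)
    return {
        "delta_i": delta_i,
        "delta_i_formula": 4.0 * abs(f_n) * abs(f_p_imag) * math.sin(theta),  # eq. (8)
        "mean_i": mean_i,                                # eq. (9)
        "bijvoet_ratio": abs(delta_i) / mean_i,          # X, eq. (12)
        "modified_ratio": abs(delta_i) / abs(f_n) ** 2,  # delta, eq. (13)
    }


if __name__ == "__main__":
    q = bijvoet_quantities(complex(2.0, 1.0), complex(3.0, -2.0), k=0.1)
    for name, value in q.items():
        print(name, round(value, 4))
```

Because the modified ratio drops $|F''_P|^{2}$ from the denominator, it is never smaller than X for the same reflection, which is the overestimation noted later for large k.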
2.2.  Nomenclature

In our study we use the terms one-atom, two-atom,
many-atom (P = MN) and many-atom (P = MC)
cases in the following sense. Consider a crystal of
space group P1 containing P heavy and Q light atoms
in the unit cell. The situation where P = 1 is called the
one-atom case and that where P = 2 the two-atom case.
When P is large, the P-group can take up either an
NC or a centrosymmetric (C, hereafter) configuration,
and these two situations are called the many-atom
(P = MN) case and the many-atom (P = MC) case
respectively.


3.  Measurability  of  Bijvoet  differences  of  a  crystal 

3.1.  Factors  influencing  the  measurability* 

The success of the X-ray anomalous scattering
method for structure determination via Fourier
methods depends strongly on the measurability of BDs,
that is, on the possibility of measuring fairly accurately
the BDs of a large percentage of reflections. The
factors which are expected to influence the measurability
can be grouped into two classes, namely
(1) structural and (2) non-structural. Examples of the
former are: (i) space-group symmetry, (ii) presence of
a centrosymmetric part in a molecule [i.e. the degree
of centrosymmetry (DCS, hereafter) of the molecule],
(iii) DCS of the crystal as a whole, (iv) the presence of
pseudosymmetry in the atomic arrangement in the
crystal structure, etc. Not all of these structural features
need coexist in a particular crystal structure.

*In §3 we shall use the term 'measurability' to stand for
'the measurability of BDs of a crystal'.

One of the non-structural factors which may be
expected to affect the measurability arises from the
limitations of physical measurement. It is known that
in a crystal not all the theoretically possible reflections
in a given (sin θ/λ) range* can actually be observed.
There always exists a finite percentage of reflections
which are too weak to be observed. Owing to the
non-observability of extremely weak reflections the observed
data will suffer a truncation at the lower end.
Such a truncation could affect the measurability. A
study of this aspect is also important since large BRs
have generally been believed to occur among extremely
weak reflections (Ramachandran & Srinivasan, 1970).
The other non-structural factor affecting the measurability
is the wavelength of the radiation used for
data collection. The closeness of the
incident wavelength to the absorption edge of an
atom could result in enhanced anomalous scattering
with a consequent increase in the measurability.

3.2.  Bijvoet  difference  variables  used  for  studying  the 
measurability 

The measurability of BDs of a crystal can be
studied by obtaining the probability distribution of
either the BR X or the normalized BD variables x or
$\Delta$. Of these the BR is the best for such studies.
Owing to the somewhat complicated functional
dependence of X on $\theta$, $y'_N$ and $y'_P$, the distribution
function of X cannot be obtained in closed form. In
some of our studies we have therefore made use of the
modified BR $\delta$, and this to some extent helped to
circumvent the theoretical difficulties. However, it is
important to note here that the results on measurability
obtained from a study of the probability distribution
of $\delta$ could somewhat overestimate
the effect, particularly when the anomalous scattering
is quite pronounced (i.e. when k is large). In our
early papers we used only the variables x and $\Delta$ in
order to obtain the theoretical distribution functions
in closed form, and this in turn made the evaluation of
these probabilities by manual computation easy.

*We shall denote sin θ/λ by S and the maximum value of S
for the data by $S_{\max}$. Here θ stands for the Bragg angle and
is different from that in (7).

3.3.  Definition  of  the  measurability 

A probability measure that is suitable for expressing
the measurability of BDs of a crystal is the
complementary cumulative function (CCF, hereafter)
of the BR*, and this is denoted by $N^{c}_{X}(X_0)$, where $X_0$
is a particular value of the random variable X. That is,

$$N^{c}_{X}(X_0) = \Pr(X > X_0), \qquad (20)$$

which denotes the probability that the BR X takes a
value greater than any specific value $X_0$. Physically,
$N^{c}_{X}(X_0)$ represents the fractional number of reflections
for which the magnitude of the BR is greater than
the given value $X_0$. Since a BR which is of the order
of 0.1 (i.e. 10%) can be easily measured, we shall treat
$N^{c}_{X}(0.1)$ and 'measurability' as synonymous in our
discussion.
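For a list of observed or simulated Bijvoet ratios, the definition (20) reduces to counting. The following is a minimal sketch of that empirical estimate; the function name and the sample values are illustrative, not data from any crystal discussed here.

```python
def measurability(bijvoet_ratios, threshold=0.1):
    """Empirical complementary cumulative function of the Bijvoet ratio.

    Returns the fraction of reflections whose |X| exceeds `threshold`,
    i.e. an estimate of Pr(X > X0) in eq. (20) with X0 = threshold.
    """
    magnitudes = [abs(x) for x in bijvoet_ratios]
    if not magnitudes:
        raise ValueError("at least one Bijvoet ratio is required")
    return sum(1 for x in magnitudes if x > threshold) / len(magnitudes)


if __name__ == "__main__":
    sample = [0.02, 0.15, -0.30, 0.08]   # hypothetical Bijvoet ratios
    print(measurability(sample))         # fraction with |X| > 0.1
```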

3.4.  Parameters  characterizing  the  measurability 

From (18) it is clear that the CCF of X will depend
on the parameters k and $\sigma_1^{2}$. Since X is a function of
$y'_N$, $y'_P$ and $\theta$ (see (18)), the probability distribution of X
can be obtained from the joint probability distribution
function of $y'_N$, $y'_P$ and $\theta$. As this distribution depends on the number
of P-atoms in the asymmetric unit and their configuration,
the CCF of X may be expected to depend
on these characteristics of the P-atoms. Further,
in crystals with special structural features the
specific parameter that characterizes the structural
feature will also be a parameter of the CCF of X.
Thus, though the measurability depends on k, $\sigma_1^{2}$,
the parameter defining the special structural feature, etc.,
the relative importance of these factors can be determined
only from a study of the behaviour of the CCF
of X with respect to variations in these parameters.
We shall take up this aspect in § 3.5 (b).

*The CCF of $\Delta$, namely $N^{c}_{\Delta}(\Delta_0)$, can also be used for this
purpose.

Of these parameters, k is determined by the type of
heavy atom and the wavelength of the X-rays, and hence
it is suitable for characterizing the influence of the
wavelength on the measurability. $\sigma_1^{2}$ is determined by
the number of P-atoms and Q-atoms and their scattering
powers; it is a convenient measure of the relative
domination of the anomalous scatterers in contributing
to the local mean intensity. The quantities k
and $\sigma_1^{2}$ are not present as parameters of the probability
distribution of the variable x, and hence x is
not as suitable as X for studying the measurability.

3.5.  Measurability  study  in  space  group  P1  when  the
Q-group  is  ideally  non-centrosymmetric

Before going into details of how the various factors
individually influence the measurability, we shall study
the influence of the number of anomalous scatterers
in the unit cell on the measurability, the relative importance
of k and $\sigma_1^{2}$, and the optimum conditions for the
measurability for crystals of space group P1 with all
atoms occurring at random positions. That is, the
Q-group is assumed to satisfy the requirements of the
acentric Wilson distribution (Wilson, 1949).

(a) The influence of the number of anomalous
scatterers. The CCFs of the normalized BD x are
shown in Fig. 2 for the cases P = 1, 2, MN and MC
(Parthasarathy & Srinivasan, 1964). It is seen that the
one-atom case is the most favourable while the many-atom
(P = MC) case is the least favourable for BD
measurement, other conditions such as k and
$\sigma_1^{2}$ being the same. The two-atom and many-atom
(P = MN) cases are more or less equally effective
and fall between the one-atom and many-atom
(P = MC) cases.

Values of $N^{c}_{\delta}(0.1)$ as a function of k and $\sigma_1^{2}$ are
given in Table 1 for the cases P = 1, MN and MC
(Parthasarathy & Parthasarathi, 1973). It is seen from
Table 1 that for given k and $\sigma_1^{2}$ in the region of general
interest (i.e. $\sigma_1^{2} < 0.7$)

$$[N^{c}_{\delta}(0.1)]_{1} > [N^{c}_{\delta}(0.1)]_{MN} > [N^{c}_{\delta}(0.1)]_{MC}. \qquad (21)$$

Thus, for the given values of k and $\sigma_1^{2}$, the measurability
is the largest in the one-atom case and the
least in the many-atom (P = MC) case. The many-atom
(P = MN) case falls between these
two. This is in agreement with the conclusion obtained
from a study of the CCF of x. It may also be noted that
even for the least favourable case (i.e. the P = MC case),
if k and $\sigma_1^{2}$ are not very small, a sufficient percentage of
reflections have measurable BDs. For example,
corresponding to k = 0.1 and $\sigma_1^{2}$ = 0.3 about 43% of
reflections have $\delta$ > 0.1. Thus structure determination
could be carried out even for the case P = MC by a
proper choice of k and $\sigma_1^{2}$.

Fig. 2. Complementary cumulative function (in %) of x for
the one-atom, two-atom and many-atom (P = MN and
P = MC) cases.

Table 1. $N^{c}_{\delta}(0.1)$ (in %) as a function of k and $\sigma_1^{2}$
for the cases P = 1, MN and MC

                      σ1²
k       P       0.1     0.3     0.5     0.7     0.9
0.06    1       23.0    42.8    43.3    34.0    10.5
        MN      18.8    32.7    36.0    32.7    18.8
        MC      16.2    26.6    29.5    28.2    20.1
0.1     1       46.1    62.6    61.4    53.5    29.0
        MN      36.0    52.1    55.3    52.1    36.0
        MC      30.1    43.4    47.2    46.4    36.4
0.2     1       73.9    80.4    79.4    74.7    58.1
        MN      61.5    73.7    75.7    73.7    61.5
        MC      51.7    64.3    68.1    68.3    60.6
0.3     1       82.7    86.8    86.1    82.8    71.1
        MN      73.2    82.1    83.6    82.1    73.2
        MC      62.9    73.6    76.9    77.5    72.0

(b) Relative importance of k and $\sigma_1^{2}$. In order to
determine the relative importance of k and $\sigma_1^{2}$ for the
measurability we shall study (for a fixed P) the
variation of $N^{c}_{\delta}(0.1)$ as a function of $\sigma_1^{2}$ (keeping k
constant) and as a function of k (keeping $\sigma_1^{2}$ constant).
Such a study enables us to decide on the type of heavy
atom to be used for preparing the heavy-atom
derivative of a given compound and on the proper
X-radiation to be employed for data collection in
order to optimize the measurability.

The curves of $N^{c}_{\delta}(0.1)$ vs $\sigma_1^{2}$ for different fixed values
of k are shown in Fig. 3 for the case P = MN. It is
seen that as $\sigma_1^{2}$ increases the percentage of reflections
for which $\delta$ > 0.1 increases and attains a maximum at
$\sigma_1^{2}$ = 0.5, thereafter decreasing to zero at $\sigma_1^{2}$ = 1.0.
Thus, for a given k, the measurability will be the
largest when $\sigma_1^{2}$ is close to 0.5 and least when it is
either 0 or 1. A P-group whose relative domination
is either too great ($\sigma_1^{2}$ > 0.9, say) or too small
($\sigma_1^{2}$ < 0.1, say) will therefore not be suitable for optimum BD
measurement.

Fig. 3. Complementary cumulative function (in %) of $\delta$ as a
function of $\sigma_1^{2}$ for different fixed values of k for the many-atom
(P = MN) case. The number on each curve denotes
the value of k.

The variation of $N^{c}_{\delta}(0.1)$ as a function of k for
different fixed values of $\sigma_1^{2}$ is shown in Fig. 4 for the
case P = MN. It is seen that the percentage of reflections
for which $\delta$ > 0.1 increases systematically as k
increases. Thus, for a given $\sigma_1^{2}$, the measurability increases
as k increases. This shows that, for realizing the full
power of the anomalous scattering method in structure
analysis, the choice of a proper wavelength for
data collection is of great importance.
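The behaviour described in (b) can be reproduced with a short calculation for the many-atom (P = MN) case. The sketch below uses a closed-form expression, $N^{c}_{\delta}(\delta_0) = 1 - \delta_0/(\delta_0^{2} + 16k^{2}\sigma_1^{2}\sigma_2^{2})^{1/2}$ with $\sigma_2^{2} = 1 - \sigma_1^{2}$, which is not quoted from the paper: it is derived here from (13) on the assumption that $F'_P$ and $F_Q$ are independent and both obey acentric Wilson statistics. A seeded Monte Carlo simulation of the same model is included as a cross-check. All function names are illustrative.

```python
import math
import random


def ccf_delta_mn(delta0, k, sigma1_sq):
    """Closed-form CCF of the modified Bijvoet ratio, P = MN case.

    Assumes F'_P and F_Q are independent complex Gaussians (acentric
    Wilson statistics) with variances sigma1_sq and 1 - sigma1_sq.
    """
    s = 4.0 * k * math.sqrt(sigma1_sq * (1.0 - sigma1_sq))
    return 1.0 - delta0 / math.sqrt(delta0 ** 2 + s ** 2)


def ccf_delta_mn_mc(delta0, k, sigma1_sq, n=20000, seed=0):
    """Monte Carlo estimate of the same quantity, as a cross-check."""
    rng = random.Random(seed)
    sp = math.sqrt(sigma1_sq / 2.0)          # per-component s.d. of F'_P
    sq = math.sqrt((1.0 - sigma1_sq) / 2.0)  # per-component s.d. of F_Q
    count = 0
    for _ in range(n):
        fp = complex(rng.gauss(0.0, sp), rng.gauss(0.0, sp))
        fq = complex(rng.gauss(0.0, sq), rng.gauss(0.0, sq))
        fn = fp + fq                         # F'_N, eq. (4)
        # |F'_P| |sin(theta)| / |F'_N| = |Im(F'_P conj(F'_N))| / |F'_N|^2
        delta = 4.0 * k * abs((fp * fn.conjugate()).imag) / abs(fn) ** 2
        if delta > delta0:
            count += 1
    return count / n


if __name__ == "__main__":
    for k in (0.06, 0.1, 0.2, 0.3):
        row = [round(100.0 * ccf_delta_mn(0.1, k, s), 1) for s in (0.1, 0.3, 0.5, 0.7, 0.9)]
        print(k, row)
```

Under these assumptions the expression depends on $\sigma_1^{2}$ only through the product $\sigma_1^{2}(1 - \sigma_1^{2})$, so it peaks at $\sigma_1^{2}$ = 0.5, vanishes at 0 and 1, and rises monotonically with k, which matches the trends read from Figs. 3 and 4.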

(c)  Optimum  conditions.  From  the  study  of  the 
nature  of  iVg(O-l)  as  functions  of  k  and  a\  (see  §§  3-5 
(a),  (b))  we  obtain  the  following  as  optimum  conditions 
for  the  measurability  in  crystals  of  space  group  PI : 
C.  S.— 10 
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Fig.  4.  Complementary  cumulative  function  (in  %)  of  S  as  a 
function  of  k  for  different  fixed  values  of  a\  for  the  many- 
atom  ( P  =  MN)  case.  The  numbers  near  the  curves  denote 
the  values  of  cr\. 

(i)  k  should  be  as  large  as  possible*,  (ii)  <x[  should  be 
close  to  0-5  and  (iii)  the  number  of  anomalous  scat¬ 
tered  in  the  unit  cell  should  be  one.  It  may  be  noted 
here  that  conditions  (i)  and  (ii)  are  also  found  to  hold 
good  in  space  groups  of  higher  symmetry  (see  § 
3.5.  (;). 

(d)  Effect  of  data-tr uncation.  Suppose  yt  is  the 
threshold  value  of  the  normalized  structure  factor 
magnitude  for  the  data.  That  is,  the  reflections  for 
which  y'N  <  yt  are  taken  to  be  too  weak  to  be  observed. 

*The measurabilities of BDs in the cases of Factor V 1a (Dale et al. 1963), methyl melaleucate iodoacetate (Hall & Maslen, 1965) and davallol iodoacetate (Yow-Lam Oh & Maslen, 1966) were large enough to enable these structures to be determined from BD data obtained photographically. The good measurabilities for these structures arise from the large values of <k> for Co and I atoms with CuKα (see § 3.5 (f) for the values of <k>).
The CCF of δ applicable to such truncated data (denoted by [N^c_δ(δ₀)]_yt) has been derived for a crystal of space group P1 by taking the Q-group to be ideally NC. The results for three cases, namely P = 1, MN and MC, have been derived (Parthasarathy & Ponnuswamy, 1981). The CCF of δ for each case depends on the parameters k, σ₁² and y_t. [N^c_δ(δ₀)]_yt denotes the fractional number of reflections (in a given range of S) for which δ > δ₀ among those for which y'_N ≥ y_t. However, for discussing the effect of data truncation on the measurability, the appropriate quantity is the fractional number of reflections for which δ > δ₀ and y'_N ≥ y_t relative to the population consisting of all the theoretically possible independent reflections in the given range of S. We shall denote this fraction by f_yt(δ₀); it is related to the CCF of δ for the truncated data by

f_yt(δ₀) = N^c_y'N(y_t) [N^c_δ(δ₀)]_yt,    (22)

where N^c_y'N(y_t) is the value of the CCF of y'_N at y'_N = y_t. The truncation limit y_t for the data of actual crystals would generally be in the neighbourhood of 0.2 (Ponnuswamy, 1979). The BD data for the observed reflections whose intensities are close to this truncation limit may not be very accurate. Hence in our discussion we shall assume that the BD data corresponding to reflections for which y'_N > 0.3 are sufficiently accurate to yield useful results. The curves of f_yt(0.1) vs σ₁² for y_t = 0 and 0.3 corresponding to typical values of k for the one-atom case are shown in Fig. 5. A study of these curves shows that data truncation due to unobserved reflections causes only a small decrease in measurability. For a typical situation in which k is
small  (k=0  01,  say — this  is  close  to  the  mean  value  of 
k for Cl with CuKα, see Parthasarathi & Parthasarathy, 1974) and P = 1 and σ₁² = 0.3, about 47% of the reflections are expected to have δ > 0.1 when the data are truncated at y_t = 0.3, while this is 55% for the untruncated data. When k has a medium value (k = 0.18, say; this is a little less than the mean value of k for I with CuKα) and P = 1 and σ₁² = 0.3, these numbers for the truncated and untruncated data are 70% and 78% respectively. Thus data truncation causes only a small decrease in the measurability and does not seriously impair it.

Fig. 5. f_yt(0.1) × 100 [i.e. the percentage of reflections for which δ > 0.1 and y'_N > y_t] as a function of σ₁² for different fixed values of k corresponding to the untruncated data (i.e. y_t = 0) and to the data truncated at y'_N = 0.3. The broken lines are for the untruncated case and the solid lines are for the truncated case. The y-axis is shown displaced a little to the left. For the correct position of this axis, shift it to the right by parallel displacement such that it passes through the point σ₁² = 0.
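The difference between the conditional fraction (counted among the observed reflections only) and the joint fraction (counted among all theoretically possible reflections) in relation (22) is a product of two probabilities. The following is a minimal sketch of that bookkeeping. The function names and the inputs 0.80 and 0.60 are hypothetical and serve only as an illustration; 55% and 47% are the values quoted in the text for the small-k case.

```python
def joint_fraction(observed_fraction, conditional_fraction):
    """Fraction of ALL possible reflections that are both observed
    and exceed the chosen threshold (product form of relation (22))."""
    for value in (observed_fraction, conditional_fraction):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")
    return observed_fraction * conditional_fraction


def relative_loss(untruncated, truncated):
    """Relative decrease in measurability caused by data truncation."""
    if untruncated <= 0.0:
        raise ValueError("untruncated measurability must be positive")
    return 1.0 - truncated / untruncated


# Hypothetical inputs: 80% of the reflections are observed, and 60% of
# those observed reflections exceed the threshold.
print(f"joint fraction: {joint_fraction(0.80, 0.60):.2f}")

# Values quoted in the text: 55% (untruncated) and 47% (truncated at 0.3).
print(f"relative loss : {relative_loss(0.55, 0.47):.3f}")
```

The second call shows that the quoted drop from 55% to 47% amounts to losing roughly one seventh of the measurable reflections.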

(e) Relevance of k over f". It is important to note that the ratio k, and not the imaginary part f", appears explicitly as a parameter of the CCF of X. Thus the measurability is determined by the value of the ratio k and not by the absolute value of f" alone. This is illustrated by two examples in Table 2. The results in Table 2 are for the one-atom case with CuKα. The Q-atoms are chosen such that 80% of them are C, 10% are N and 10% are O. It is seen that though f" for Br with CuKα is nearly twice that for Cl, the value of <k> for Br is slightly less than that for Cl. Thus, in spite of f" for Br being twice that for Cl, the measurability for the Br compound is less than that for the Cl compound. Similar results are obtained for the second example in Table 2.

Table 2. Examples showing the relevance of k over f" for the measurability of BDs of a crystal

No.  P-atom  Q    <σ₁²>  f"     <k>   f_0.3(0.1)
1    Cl      50   24.1   0.702  8.2   45.8
     Br      50   62.3   1.283  6.5   37.2
2    Co      100  23.0   3.608  31.3  78.8
     I       100  67.0   6.835  22.2  74.1

Note: The values of <σ₁²>, <k> and f_0.3(0.1) are in percent. Values of f" are taken from Srinivasan (1972). <σ₁²> and <k> are the average values for the range 0 ≤ sin θ/λ ≤ 0.55 Å⁻¹.
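The point made with Table 2 can be checked mechanically: within each example, the atom with the larger f" is not the atom with the larger measurability, whereas the atom with the larger <k> is. The sketch below uses the Table 2 values as printed; the helper name `best_atom` and the column constants are hypothetical.

```python
# Rows of Table 2: (example number, P-atom, f'', <k> in %, f_0.3(0.1) in %)
TABLE2 = [
    (1, "Cl", 0.702, 8.2, 45.8),
    (1, "Br", 1.283, 6.5, 37.2),
    (2, "Co", 3.608, 31.3, 78.8),
    (2, "I", 6.835, 22.2, 74.1),
]

# Column indices into the tuples above.
F_DOUBLE_PRIME, MEAN_K, MEASURABILITY = 2, 3, 4


def best_atom(example, column):
    """Return the P-atom with the largest value in `column` for one example."""
    rows = [row for row in TABLE2 if row[0] == example]
    return max(rows, key=lambda row: row[column])[1]


for example in (1, 2):
    print(
        f"example {example}: largest f'' -> {best_atom(example, F_DOUBLE_PRIME)}, "
        f"largest <k> -> {best_atom(example, MEAN_K)}, "
        f"largest measurability -> {best_atom(example, MEASURABILITY)}"
    )
```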

(f) Choice of proper wavelength for atoms with Z = 10 to 98 to optimize the measurability. We have seen that the ratio k plays a prominent role in determining the measurability. We shall therefore study the manner in which k changes with the atomic number Z of the atom. This incidentally helps us to understand the relative efficiency of two of the radiations generally used in crystal structure analysis, namely CuKα and MoKα, with respect to structures containing various types of heavy atoms. Since f₀ decreases while f' and f" remain practically constant with increasing S, k will in general be an increasing function of S. For obtaining the theoretical CCF of any BD variable applicable to a given crystal, the average value of k (<k>, say) appropriate to the situation on hand must be used. Since f' and f" depend on the wavelength of the radiation, <k> will also differ for the different wavelengths. Values of <k> for reflections up to S = 0.55 Å⁻¹ (this corresponds to a 2θ of about 115° for CuKα and 45° for MoKα) for atoms with Z = 10 to 98 are shown in Fig. 6 for both CuKα and MoKα radiations. From Fig. 6 it is seen that CuKα is better suited than MoKα for all the elements except those for which Z ranges from 28 to 39 and 68 to 86. However, for the elements in the range Z = 68 to 86 the superiority of MoKα over CuKα is only marginal. For CuKα radiation Cr, Mn, Co, I, Ba, Sm, Gd, Pt, Au, Hg and Pb are some of the suitable elements, and for MoKα Zn, Br, Rb, Pt, Au, Hg and Pb are suitable.
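The correspondence between the limit S = sin θ/λ = 0.55 Å⁻¹ and the scattering angles quoted above follows from sin θ = Sλ, and the resolution follows from d = 1/(2S). The sketch below evaluates both. The wavelengths 1.5418 Å (CuKα) and 0.7107 Å (MoKα) are assumed standard mean values and are not given in the text; the function names are hypothetical.

```python
import math

# Assumed mean K-alpha wavelengths in angstrom (not stated in the text).
WAVELENGTH = {"CuKa": 1.5418, "MoKa": 0.7107}


def two_theta_deg(s, wavelength):
    """Scattering angle 2*theta in degrees for S = sin(theta)/lambda."""
    sin_theta = s * wavelength
    if not 0.0 <= sin_theta <= 1.0:
        raise ValueError("S cannot be reached with this wavelength")
    return 2.0 * math.degrees(math.asin(sin_theta))


def resolution(s):
    """Bragg spacing d in angstrom, from d = lambda/(2 sin theta) = 1/(2S)."""
    return 1.0 / (2.0 * s)


for name, lam in WAVELENGTH.items():
    print(f"{name}: 2theta at S = 0.55 is {two_theta_deg(0.55, lam):.1f} deg")

print(f"d at S = 0.25 is {resolution(0.25):.1f} angstrom")
```

With these assumed wavelengths the angles come out close to the rounded values in the text, and S = 0.25 Å⁻¹ corresponds to data of 2 Å resolution.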

Fig. 6. Average value of k (appropriate to the range 0 < sin θ/λ ≤ 0.55 Å⁻¹) as a function of the atomic number Z for CuKα and MoKα radiations. The thick lines are for CuKα (S_max = 0.55 Å⁻¹), the thin lines are for MoKα (S_max = 0.55 Å⁻¹) and the broken lines are for CuKα (S_max = 0.25 Å⁻¹).

(g) Measurability in macromolecular crystals. The power of the X-ray anomalous scattering method for structure solution of macromolecular crystals (e.g. proteins) has been pointed out by Ramachandran & Parthasarathy (1965). We shall now discuss measurability in such complex structures. The values of f_0.3(0.1) are given in Table 3 for a number of typical situations. The results in Table 3 have been computed for crystals of space group P1 with one anomalous scatterer and Q light atoms per asymmetric unit. The Q-atoms are chosen such that 80% of them are carbons, 10% are nitrogens and 10% are oxygens. The values of <σ₁²> and <k> (for CuKα) correspond to data of 2 Å resolution. It is seen that for an asymmetric unit with one anomalous scatterer and 2000 light atoms the measurability values are 56% for Ba, 68% for Sm, 67% for Gd, 45% for Pt, 47% for Au, 49% for Hg, 53% for Pb and 67% for U. Thus with respect to measurability Sm, Gd and U are better than Au, Pt, Hg and Pb, and this arises mainly from the larger values of <k> for the former elements. This shows that for realizing the full power of the anomalous scattering method the choice of proper wavelength is important. It is also seen from the examples in Table 3 that by exploiting the anomalous scattering in an optimum way (i.e. by choosing a proper heavy-atom derivative and by employing a proper wavelength for data collection) reasonably good measurability (more than 50%, say) can in general be obtained for proteins with even a few thousand atoms.

(h) Measurability in crystals of moderate complexity. We shall now discuss the measurability in the case of crystals of moderate complexity (i.e. crystals containing a few hundred atoms per asymmetric unit) by taking a few examples. The values of f_0.3(0.1) are given in Table 4 for a few typical cases; these correspond to the one-atom case and pertain to data for which 0 < S < 0.55 Å⁻¹. It is seen from Table 4 that by using heavy atoms such as Fe, Co and I with CuKα one can obtain measurability as high as 70% even in structures with 500 atoms per asymmetric unit.

It is also seen from Table 4 that the measurability in structures containing about 40 atoms per asymmetric unit is nearly 40% with S as the anomalous scatterer and 47% with Cl (both with CuKα). Thus
Table 3. Measurability of Bijvoet differences to be expected in macromolecular crystals containing one anomalous scatterer and Q normal scatterers per asymmetric unit

Atom  Z   <k> (%)  Q     <σ₁²> (%)  f_0.3(0.1) (%)
Ba    56  19.3     500   19.0       70.0
                   1000  10.5       65.3
                   1500  7.3        60.7
                   2000  5.6        56.1
                   2500  4.5        51.9
                   3000  3.8        47.9
Sm    62  27.0     500   20.3       76.3
                   1000  11.3       73.4
                   1500  7.9        70.6
                   2000  6.0        68.0
                   2500  4.9        65.5
                   3000  4.1        63.1
Gd    64  27.2     500   19.1       76.2
                   1000  10.6       73.0
                   1500  7.3        70.1
                   2000  5.6        67.4
                   2500  4.5        64.7
                   3000  3.8        62.2
Pt    78  11.2     500   31.7       58.8
                   1000  18.9       54.9
                   1500  13.5       49.8
                   2000  10.5       44.9
                   2500  8.6        40.4
                   3000  7.2        36.4
Au    79  11.6     500   32.5       59.9
                   1000  19.5       56.4
                   1500  13.9       51.8
                   2000  10.8       47.2
                   2500  8.9        42.8
                   3000  7.5        39.0
Hg    80  12.0     500   33.1       60.9
                   1000  19.9       57.8
                   1500  14.3       53.6
                   2000  11.1       49.2
                   2500  9.1        45.0
                   3000  7.7        41.3
Pb    82  13.0     500   34.2       63.2
                   1000  20.8       60.6
                   1500  14.9       57.1
                   2000  11.6       53.4
                   2500  9.5        49.7
                   3000  8.1        46.2
U     92  18.5     500   39.0       71.5
                   1000  24.3       70.3
                   1500  17.7       68.6
                   2000  13.9       66.9
                   2500  11.5       65.0
                   3000  9.7        63.2

Table 4. Measurability of Bijvoet differences to be expected in crystals of small molecules and crystals of moderate complexity.

Atom        <k> (%)  Q    <σ₁²> (%)  f_0.3(0.1) (%)
P           5.6      20   39.3       34.9
                     30   30.2       32.7
                     50   20.7       26.6
S           6.9      20   41.8       43.0
                     30   32.5       42.0
                     50   22.4       37.3
                     80   15.4       30.1
Cl          8.2      20   44.1       49.1
                     30   34.6       48.9
                     50   24.1       45.8
                     80   16.6       39.7
                     100  13.8       35.8
Fe          26.5     50   39.3       77.6
                     100  24.5       76.7
                     300  9.8        72.0
                     500  6.1        67.7
Co          31.3     50   37.3       79.8
                     100  23.0       78.8
                     300  9.0        74.5
                     500  5.6        70.8
Br          6.5      20   80.4       26.2
                     30   73.3       31.9
                     50   62.3       37.2
                     80   51.0       40.0
                     100  45.5       40.7
                     200  29.6       38.7
I           22.2     50   80.1       71.1
                     100  67.0       74.1
                     300  40.7       74.9
                     500  29.3       74.3
Br (MoKα)   12.3     20   81.0       51.2
                     50   63.3       59.4
                     100  46.5       61.6
                     300  22.7       59.7
                     500  15.0       55.2
Cu (MoKα)   7.9      30   65.3       43.4
                     50   53.2       46.7
                     100  36.3       47.7
                     200  22.2       43.1
                     300  16.0       37.3

structures containing an S (or Cl) atom and 50 other light atoms per asymmetric unit can in principle be tackled by the anomalous scattering method.
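The figures of nearly 40% (S) and 47% (Cl) quoted for about 40 atoms per asymmetric unit lie between the tabulated entries for Q = 30 and Q = 50 in Table 4. A minimal sketch of that reading, assuming simple linear interpolation in Q (an assumption made here for illustration; the text does not say how the values were obtained), is given below. The function name is hypothetical.

```python
# f_0.3(0.1) in % from Table 4 (CuKa), as (Q, measurability) pairs.
TABLE4 = {
    "S": [(20, 43.0), (30, 42.0), (50, 37.3), (80, 30.1)],
    "Cl": [(20, 49.1), (30, 48.9), (50, 45.8), (80, 39.7), (100, 35.8)],
}


def measurability_at(atom, q):
    """Linearly interpolate the tabulated measurability (in %) at Q = q."""
    rows = TABLE4[atom]
    if not rows[0][0] <= q <= rows[-1][0]:
        raise ValueError("Q lies outside the tabulated range")
    for (q0, f0), (q1, f1) in zip(rows, rows[1:]):
        if q0 <= q <= q1:
            return f0 + (f1 - f0) * (q - q0) / (q1 - q0)


for atom in ("S", "Cl"):
    print(f"{atom}: about {measurability_at(atom, 40):.1f}% at Q = 40")
```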

(i) Measurability in light atom structures. We shall briefly discuss the measurability in light atom structures, since this is important with respect to the determination of absolute configuration in such structures. The values of f_yt(δ₀) corresponding to δ₀ = 0.03, 0.05 and 0.1 and y_t = 0.2 and 0.3 are given in Table 5 for the case P = MN by taking the value of <k> to be 0.011, which corresponds to the mean value of k for oxygen with CuKα radiation in the range … Å⁻¹. The results in this table have been computed on the assumption that k of C for CuKα is negligible compared to that of O, and these results may be applied to light atom structures containing C and O. A study of this table shows that a dozen reflections or more can in general be found for which δ > 0.05 and y'_N > 0.3. Thus, in spite of the data truncation due to unobserved reflections, BDs can be measured for a good number of reflections for establishing the absolute configuration (by the Bijvoet method) of an NC structure containing only light atoms.
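The statement that a dozen or more suitable reflections can generally be found can be made concrete by converting the percentages of Table 5 into counts. The sketch below does this for a data set of 1000 independent reflections; that size is a hypothetical figure chosen only for illustration, and the function names are likewise hypothetical.

```python
import math

# Table 5: f_yt(delta0) in %, for sigma_1^2 = 0.1, 0.2, 0.3, 0.4, 0.5.
SIGMA1_SQ = (0.1, 0.2, 0.3, 0.4, 0.5)
TABLE5 = {
    # (y_t, delta0): percentages
    (0.2, 0.03): (5.8, 10.8, 13.9, 15.7, 16.2),
    (0.2, 0.05): (1.4, 3.3, 4.7, 5.6, 5.8),
    (0.2, 0.10): (0.0, 0.3, 0.5, 0.7, 0.7),
    (0.3, 0.03): (4.2, 8.3, 11.1, 12.8, 13.3),
    (0.3, 0.05): (0.5, 1.9, 3.0, 3.6, 3.9),
    (0.3, 0.10): (0.0, 0.0, 0.1, 0.2, 0.2),
}


def percent(y_t, delta0, sigma1_sq):
    """Look up the tabulated percentage for one parameter combination."""
    return TABLE5[(y_t, delta0)][SIGMA1_SQ.index(sigma1_sq)]


def expected_count(n_reflections, y_t, delta0, sigma1_sq):
    """Expected number of reflections with delta > delta0 and y'_N >= y_t."""
    return n_reflections * percent(y_t, delta0, sigma1_sq) / 100.0


def reflections_needed(target, y_t, delta0, sigma1_sq):
    """Smallest data-set size expected to yield `target` such reflections."""
    pct = percent(y_t, delta0, sigma1_sq)
    if pct == 0.0:
        raise ValueError("tabulated percentage is zero")
    return math.ceil(target * 100.0 / pct)


# Hypothetical data set of 1000 independent reflections.
print(expected_count(1000, 0.3, 0.05, 0.3))
print(reflections_needed(12, 0.3, 0.05, 0.3))
```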

(j) Effect of space group symmetry. In crystals with heavy atoms the probability distribution of intensities depends on the presence of space-group symmetry elements other than the centre of symmetry (Karle & Hauptman, 1953; Hauptman & Karle, 1953). Foster & Hargreaves (1963a, b) have shown that space groups of the triclinic, monoclinic and orthorhombic systems can be classified into 7 categories* (called 1, 2, 3, ..., 7) based on the form of the trigonometric factors of the geometrical structure factor. Of these, only the categories 1, 3, 5 and 6 belong to the NC case and hence we need consider only these here (see Foster & Hargreaves, 1963b). Though the probability distribution of X taking into account the space-group symmetry is not available, the expectation value of X has been calculated for crystals
Table 5. Values of f_yt(δ₀) (in %) as a function of σ₁² corresponding to y_t = 0.2 and 0.3 and δ₀ = 0.03, 0.05 and 0.1 for the many-atom (P = MN) case when k = 0.011.

y_t   δ₀     σ₁² = 0.1   0.2    0.3    0.4    0.5
0.2   0.03   5.8         10.8   13.9   15.7   16.2
      0.05   1.4         3.3    4.7    5.6    5.8
      0.10   0.0         0.3    0.5    0.7    0.7
0.3   0.03   4.2         8.3    11.1   12.8   13.3
      0.05   0.5         1.9    3.0    3.6    3.9
      0.10   0.0         0.0    0.1    0.2    0.2

*The  results  of  this  section  can  in  principle  be  extended  to 
space  groups  of  higher  symmetry  by  making  use  of  the  results 
of  Wilson  (1978),  Shmueli  &  Wilson  (1981)  and  Shmueli  & 
Kaldor  (1981)  and  this  is  in  progress. 
(of categories 1, 3, 5 and 6) containing 1 and 2 heavy atoms (all of one type) per asymmetric unit (Velmurugan & Parthasarathy, 1981). The values of <X> as a function of σ₁² for the cases p = 1 and 2 corresponding to k = 0.1 and 0.3 are shown in Fig. 7. The values of <X> as a function of k for σ₁² = 0.1 and 0.5 are shown in Fig. 8. From these figures we obtain the following results: (i) For the very commonly occurring case of crystals containing one anomalous scatterer per asymmetric unit (i.e. the p = 1 case), in the region of general interest (i.e. σ₁² < 0.7) the triclinic space group P1 is the most favourable while the orthorhombic crystal of category 6 is the least favourable (the other conditions such as the complexity of the asymmetric unit, the type of heavy atom and the wavelength used being the same). The categories 3




Fig. 7. Expectation value of X (in %) as a function of σ₁² (corresponding to k = 0.1 and 0.3) for the non-centrosymmetric space group categories 1, 3, 5 and 6 of the triclinic, monoclinic and orthorhombic systems containing p (= 1 or 2) anomalous scatterers per asymmetric unit.

Fig. 8. Expectation value of X (in %) as a function of k (corresponding to σ₁² = 0.1 and 0.5) for the space group categories 1, 3, 5 and 6 containing p (= 1 or 2) anomalous scatterers per asymmetric unit.

and 5 are more or less equally effective and fall somewhat between the categories 1 and 6. (ii) For given k and σ₁², as the number of anomalous scatterers per asymmetric unit increases from 1 to 2 the measurability decreases in category 1, increases somewhat in category 6 and remains practically unaffected in categories 3 and 5; consequently the distinction between the various cases becomes less marked. For p ≥ 3 the measurability is practically unaffected by space-group symmetry. These results are in agreement with those obtained by Parthasarathy & Ponnuswamy (1976) from a study of the expectation value of x. Incidentally it may be noted from these figures that the optimum conditions (i) and (ii) for measurability deduced for space group P1 (see § 3.5 (c)) are valid for space groups of higher symmetry.
3.6. Measurability in the presence of degree of centrosymmetry

So far we have considered the Q-group to be ideally NC in configuration. The Q-groups of actual crystals may exhibit different types of DCS. We shall discuss the measurability for the following three situations in crystals of space group P1 with one anomalous and Q similar normal scatterers in the unit cell. Situation (i): The Q-group consists of an ideally C part containing Q_c atoms besides an ideally NC part containing Q_n atoms (i.e. Q = Q_c + Q_n). We shall take the single P-atom to lie at a point which is significantly different from the centre of the Q_c-group. Situation (ii): This is similar to situation (i) except that the single P-atom is now taken to lie exactly at the centre of the Q_c-group. For convenience we shall refer to the DCS met with in (i) as molecular DCS and that in (ii) as Type II DCS of the crystal (Parthasarathy & Parthasarathi, 1976a). Situation (iii): The Q-group consists of Q/2 atoms at r_j (j = 1 to Q/2) and the other Q/2 atoms at −r_j + Δr_j (j = 1 to Q/2), where the Δr_j (j = 1 to Q/2) are Q/2 mutually independent 3-dimensional Gaussian vectors independent of the r_j's. The single P-atom is taken to lie at the centroid of the Q_c-group. We shall refer to this situation as Type I DCS of the crystal. We shall discuss the measurability for these three situations in §§ 3.6 (a), (b) and (c).

(a) Molecular degree of centrosymmetry. The DCS of the molecule (i.e. the Q-group) can be characterized by the quantity r defined by

r = <|F_Qc|²> / <|F_Q|²>,

which for a Q-group made up of similar atoms can be written as

r = Q_c/Q.    (23)

r is thus the ratio of the number of atoms in the C part of the Q-group to the total number of atoms in the Q-group. r is 0 when the Q-group is ideally NC and 1 when it is completely C, and it takes intermediate values corresponding to different DCS. The CCF of X depends on the parameters k, σ₁² and r (Parthasarathy & Parthasarathi, 1974; see also Swaminathan & Srinivasan, 1974 for the limiting case r = 1). The curves of N^c_X(0.05) vs r are shown in Fig. 9 for different


Fig. 9. N^c_X(0.05) (in %) as a function of r (i.e. the degree of centrosymmetry of the molecule) for different fixed values of σ₁². (a), (b) and (c) correspond to k = 0.1, 0.2 and 0.3 respectively.
fixed values of σ₁² corresponding to k = 0.1, 0.2 and 0.3. These curves are nearly flat over a wide range of r (r < 0.5, say), showing thereby that the measurability would be practically unaffected even if 50% of the atoms in the molecule form a single C group. It is interesting to note that even if the whole molecule is completely C (the combined P- and Q-groups then still have an overall NC configuration), in spite of the decrease in the value of N^c_X(0.05), there still exists a sufficient percentage of reflections with measurable BD. For example, when σ₁² = 0.4 and k = 0.1 we find N^c_X(0.05) to be 72% for r = 0 and 60% for r = 1. Thus in actual crystals the molecular DCS would pose no special problem with respect to measurability.

(b) Crystal with type I degree of centrosymmetry. A convenient measure of the Type I DCS of the crystal is <|Δr|>_Q. The CCF of X depends on the parameters k, σ₁² and D_Q (Parthasarathy & Parthasarathi, 1976b), where D_Q is defined as

D_Q = <cos 2π H·Δr>_Q = exp[…].

The curves of N^c_X(0.05) vs <|Δr|>_Q for the cases k = 0.1, 0.2 and 0.3 (for S = 0.35 Å⁻¹) are shown in Fig. 10. It is seen that when <|Δr|>_Q = 0.1 Å, k = 0.1, S = 0.35 Å⁻¹ and σ₁² = 0.5, only 7% of the reflections have X > 0.05. Thus in crystals with a high Type I DCS [<|Δr|>_Q < 0.1 Å, say] the measurability is too low to be of use for structure determination. However, the breakdown of Friedel's law can be detected if k is sufficiently large. This point is relevant to space-group determination using anomalous scattering in such crystals (Srinivasan & Vijayalakshmi, 1972).
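The quantity D_Q is the average of cos 2πH·Δr over the Gaussian displacements Δr. The sketch below evaluates it under an explicit assumption that is not taken from this text: the displacements are isotropic three-dimensional Gaussian vectors, for which the average equals exp(−2π²H²σ²) with σ the standard deviation per component, H = 2S, and <|Δr|> = σ√(8/π). The function names are hypothetical, and the closed form should be read as a standard textbook result, not as the author's printed expression.

```python
import math


def sigma_from_mean_abs(mean_abs_dr):
    """Per-component standard deviation of an isotropic 3-D Gaussian
    whose mean displacement magnitude is `mean_abs_dr`
    (mean of a Maxwell distribution = sigma * sqrt(8/pi))."""
    return mean_abs_dr * math.sqrt(math.pi / 8.0)


def d_q(mean_abs_dr, s):
    """Average of cos(2*pi*H.dr) for isotropic Gaussian dr, with |H| = 2S.

    Assumption: exp(-2*pi^2*H^2*sigma^2), which simplifies to
    exp(-pi^3 * <|dr|>^2 * S^2).
    """
    sigma = sigma_from_mean_abs(mean_abs_dr)
    h = 2.0 * s
    return math.exp(-2.0 * math.pi ** 2 * h ** 2 * sigma ** 2)


# Parameters quoted in the text: <|dr|> = 0.1 angstrom, S = 0.35 per angstrom.
print(f"D_Q = {d_q(0.1, 0.35):.3f}")
```

Under this assumption D_Q is about 0.96 for the quoted parameters, i.e. the Q-group is very nearly centrosymmetric, which is in line with the low measurability reported for that case.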

(c) Crystal with type II degree of centrosymmetry. A convenient measure of the Type II DCS of the crystal is the quantity Q_c/Q (= r, say). The CCF of X depends on the parameters k, σ₁² and r (Parthasarathy & Parthasarathi, 1976b). Curves of N^c_X(0.05) vs r corresponding to k = 0.1, 0.2 and 0.3 for different σ₁² are shown in Fig. 11. It is seen that for medium and large values of k (k > 0.15, say) the curves are practically flat for r < 0.5. However, for small k (e.g. k = 0.05) the measurability decreases more or less systematically with increasing r. It is also seen that for k = 0.2, σ₁² = 0.5 and r = 0.9, N^c_X(0.05) is 55%. Thus crystals with Type II DCS as high as r = 0.9 have a sufficient percentage of reflections with measurable BD for structure determination, provided k is sufficiently large (k > 0.15, say).

A comparison of the results for a crystal with Type II DCS (§ 3.6 (c)) and a crystal with molecular DCS (§ 3.6 (a)) is interesting. From Fig. 12 it is seen that for given k and r the measurability in the former case is less than that in the latter. Further, when r tends to 1, the measurability in the former case is too low, while that in the latter case remains quite high even at r = 1.

Fig. 10. N^c_X(0.05) (in %) as a function of <|Δr|>_Q (in Å) for different fixed values of σ₁² in crystals with Type I degree of centrosymmetry. (a), (b) and (c) correspond to k = 0.1, 0.2 and 0.3 respectively. These curves are for sin θ/λ = 0.35 Å⁻¹.

Fig. 11. N^c_X(0.05) (in %) as a function of r (i.e. the Type II degree of centrosymmetry of the crystal) for different fixed values of σ₁². (a), (b) and (c) correspond to k = 0.1, 0.2 and 0.3 respectively.

Fig. 12. Comparison of N^c_X(0.05) (in %) versus r for the two situations, namely, (i) the case of molecular DCS (dotted line) and (ii) crystal with Type II DCS (solid line). The numbers near the curves denote the values of σ₁². (a), (b) and (c) correspond to k = 0.1, 0.2 and 0.3 respectively.


4.  Measurability  of  Bijvoet  difference  of  a  reflection 

The Bijvoet method (Bijvoet et al. 1951) of determining the absolute configuration in NC crystals consists of the following steps: (i) Completely determine and refine the crystal structure in a given configuration using the hkl data. (ii) Calculate the BR for all the reflections from the known atomic parameters and choose the top few (a dozen, say) reflections as optimum. (iii) Measure the BRs for these optimum reflections and compare these with the corresponding calculated values to arrive at the correct configuration. Thus it is necessary to keep the crystal until the structure is completely refined in order to collect the BR data for the few optimum reflections. We shall describe a statistical method which circumvents this necessity to some extent. This method is particularly useful when the anomalous scattering effect of the heavy-atom compound is not pronounced* (e.g. organic molecules containing S, P or Cl). This method requires a knowledge of the intensity data for the hkl reflections and the magnitude |F'_P| of the heavy atoms. Since the heavy atoms may be located from the Patterson computed using the hkl intensity data, |F'_P| may be taken to be known. Thus the present method can be used to select the optimum reflections immediately after the heavy atoms have been located.

*If the anomalous scattering effect is quite pronounced, a sufficient number of reflections showing large BDs can be recognized even in the X-ray photographs. Hence the few reflections required can be chosen from a visual estimation of photographs during the stage of data collection.

(a) Definition of the measurability. A probability measure that is suitable for expressing the measurability of the BD of a reflection for the present situation is the conditional CCF of X for given |F'_N| and |F'_P|. That is,

N^c_X(X₀; |F'_N|, |F'_P|) = P_r(X > X₀; |F'_N|, |F'_P|),    (25)

where X₀ is a particular value of X. A reflection for which the probability value N^c_X(0.1; |F'_N|, |F'_P|) is high (greater than 0.75, say) may be chosen as suitable for BD measurement. Velmurugan, Parthasarathy & Parthasarathi (1979) have shown that

[integral expression involving cosh; not recoverable from the scan] = N^c_X(X₀; α, β), say,    (26)

where α and β are defined to be

[definitions of α and β; not recoverable from the scan]    (27)

Since α and β are single-valued functions of |F'_N| and |F'_P|, (26) can be taken to depend on the parameters α and β instead of |F'_N| and |F'_P|. Values of k and <|F_Q|²> for a given reflection can be readily found from a knowledge of the unit cell content and unit cell parameters. Since the P-atoms are known, |F'_P| will be known. When the anomalous scattering is not pronounced, |F'_N| may be equated to F_obs(hkl). For a given X₀ (0.1, say) the probability value N^c_X(0.1; α, β) applicable to the reflection may be evaluated numerically from (26). The top few reflections having high probability values may be taken to be optimum for BD measurement.


1 66  S.  Parthasarathy 


A simple procedure for implementing (26) can be devised by studying the properties of the equation

N^c_X(X₀; α, β) = p.    (28)

For given values of X₀ (= 0.1, say) and p, (28) represents a curve in the α, β-plane. The curves corresponding to p = 0.75, 0.8, ... are shown in Fig. 13 for the case X₀ = 0.1. For given X₀ and p, the curve starts from a point (α_min, 0) on the α-axis and β increases thereafter as α increases. Thus, for given X₀ and p, relation (28) cannot be satisfied if α < α_min. This means that, given the values of X₀ and p, the reflections for which α < α_min cannot show a BR > X₀ with a probability value p or more. This property may be used to eliminate reflections which are not suited for BD measurement. For any α (> α_min) there is a unique β (= β_t, say) for which (28) is satisfied. This β_t can be obtained by solving (28) numerically.

Fig. 13. Curves in the α, β-plane representing the equation N^c_X(0.1; α, β) = p corresponding to p = 0.75, 0.8, ..., 0.95.
(b) Implementation of the theoretical results. The following procedure may be used to select the few optimum reflections: (i) Locate the heavy atoms (from the Patterson, say) using the intensity data of the hkl reflections. (ii) Calculate the value of α from (27) for each reflection using the known values of F_obs(hkl) and |F'_P|. Reject the reflections for which α < 0.26, i.e. for which P_r(X > 0.1; α, β) < 0.75. Let N₁ be the number of remaining reflections. (iii) Find the values of β_t for these N₁ reflections from their values of α by linear interpolation from the results in Table 6. Calculate the values of β from (27) for these N₁ reflections. Reject the reflections for which β > β_t. Let N₂ be the remaining number of reflections. (iv) Calculate the probability values N^c_X(0.1; α, β) for these N₂ reflections by bilinear interpolation using the results in Table 7. Order these N₂ reflections with decreasing probability values and choose the top dozen reflections as optimum for BD measurement.
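Steps (ii) and (iii) of the procedure above amount to a table lookup with linear interpolation followed by two rejection tests. The following minimal sketch implements only that screening stage, using the Table 6 values for p = 0.75. The values of α and β for each reflection are taken as given inputs, since relation (27) is not reproduced here; the function names and the sample reflections are hypothetical. Step (iv) is not covered because it needs the entries of Table 7.

```python
from bisect import bisect_right

# Table 6: threshold beta_t as a function of alpha for p = 0.75, X0 = 0.1.
ALPHA = [0.26] + [round(0.30 + 0.05 * i, 2) for i in range(35)]
BETA_T = [
    0.00, 0.97, 1.40, 1.86, 2.32, 2.82, 3.35, 3.93, 4.56,
    5.24, 5.98, 6.77, 7.61, 8.50, 9.44, 10.43, 11.47, 12.57,
    13.72, 14.91, 16.16, 17.46, 18.81, 20.21, 21.66, 23.17, 24.72,
    26.33, 27.98, 29.69, 31.45, 33.26, 35.12, 37.03, 39.00, 41.01,
]


def beta_threshold(alpha):
    """Linear interpolation of beta_t in Table 6.

    Returns None when alpha is below the tabulated minimum 0.26,
    which is the rejection criterion of step (ii)."""
    if alpha < ALPHA[0]:
        return None
    if alpha > ALPHA[-1]:
        raise ValueError("alpha lies beyond the tabulated range")
    i = min(bisect_right(ALPHA, alpha) - 1, len(ALPHA) - 2)
    a0, a1 = ALPHA[i], ALPHA[i + 1]
    b0, b1 = BETA_T[i], BETA_T[i + 1]
    return b0 + (b1 - b0) * (alpha - a0) / (a1 - a0)


def screen(reflections):
    """Keep the labels of reflections passing steps (ii) and (iii).

    `reflections` is an iterable of (label, alpha, beta) tuples."""
    kept = []
    for label, alpha, beta in reflections:
        limit = beta_threshold(alpha)
        if limit is not None and beta <= limit:
            kept.append(label)
    return kept


# Hypothetical reflections: (label, alpha, beta)
sample = [("1 0 2", 0.20, 0.0), ("2 1 1", 1.00, 5.0), ("3 0 1", 1.00, 12.0)]
print(screen(sample))
```

Only the second sample reflection survives: the first has α below 0.26 and the third has β above the interpolated threshold.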

The results of application of (26) in the case of a few actual crystals are shown in Table 8. The probability values N^c_X(0.1; α, β) are calculated for all the reflections directly from (26). The reflections are then arranged in decreasing order of probability. The top ten reflections with the highest probability values are given. The last column contains the corresponding
Table 6. $\beta_t$ as a function of $\alpha$ satisfying the equation
$N_{cx}(0.1;\ \alpha, \beta) = 0.75$

   α       β        α       β        α       β        α       β
  0.26    0.00     0.70    5.24     1.15   13.72     1.60   26.33
  0.30    0.97     0.75    5.98     1.20   14.91     1.65   27.98
  0.35    1.40     0.80    6.77     1.25   16.16     1.70   29.69
  0.40    1.86     0.85    7.61     1.30   17.46     1.75   31.45
  0.45    2.32     0.90    8.50     1.35   18.81     1.80   33.26
  0.50    2.82     0.95    9.44     1.40   20.21     1.85   35.12
  0.55    3.35     1.00   10.43     1.45   21.66     1.90   37.03
  0.60    3.93     1.05   11.47     1.50   23.17     1.95   39.00
  0.65    4.56     1.10   12.57     1.55   24.72     2.00   41.01

Table 7. Conditional complementary cumulative function
$N_{cx}(0.1;\ \alpha, \beta) \times 1000$ as a function of $\alpha$ and $\beta$

Table 8. Top ten reflections with the highest probability values
$N_{cx}(0.1;\ \alpha, \beta)$ and the corresponding observed values
of the Bijvoet ratio for some actual crystals.

observed values of the BR. For each crystal the
reflection for which the prediction is wrong is shown
with an asterisk (*). The present method is seen to
be successful in 73% of cases on the average.

The  result  in  (26)  can  also  be  used  to  select  the 
reflections  for  BD  measurement  for  the  purpose  of 
structure  determination.  For  details  one  may  refer 
to  the  original  paper. 


5.  Conclusions 

The main results on the measurability of BDs of a
crystal may be summarized as follows: (a) The
optimum conditions for the measurability of BDs of
a crystal are: (i) $k$ should be as large as possible and
(ii) $\sigma_1^2$ should be in the neighbourhood of 0.5. (b)
Measurability depends on the ratio $k$ rather than on
$f''$. (c) Measurability is influenced more by $k$ than by
$\sigma_1^2$. A proper choice of wavelength is therefore of
great importance to realize the full power of anomalous
scattering. (d) Data truncation due to
unobserved reflections causes only a small decrease
in the measurability. Hence even in complex
non-centrosymmetric structures containing a few thousand
atoms, measurability of the order of 50% or more
may be obtained by exploiting the anomalous scattering
phenomenon in an optimum way. (e) In crystals
with one anomalous scatterer per asymmetric unit
the triclinic category 1 is the most favourable while
the orthorhombic category 6 is the least favourable
(the other conditions, such as the complexity of the
asymmetric unit, the type of heavy atom and the
wavelength used, being the same). The distinction
between the space-group categories diminishes
and evens out as the number of anomalous
scatterers per asymmetric unit increases. (f) The
measurability is practically unaffected even if 50%
of the atoms in a molecule form a single
centrosymmetric group. (g) In crystals with a high Type I
degree of centrosymmetry the measurability is too
low to be of use for structure determination.

Intensity Statistics and Non-Independence

By A. J. C. Wilson

Abstract

The  usual  expressions  for  the  probability  distribution 
of  X-ray  reflexions  are  derived  on  the  assumption  that 
the  contributions  of  individual  atoms  to  the  structure 
factor  are  independent.  In  reality  the  finite  size  and 
stereochemical  properties  of  atoms  prevent  complete 
independence.  Central-limit  theorems  still  apply,  but 
are  there  valid  expansions  of  the  Gram-Charlier  or 
Edgeworth  type? 


Statistics  and  non-independence 

While  the  statisticians  are  still  here  I  should  like  to 
raise  a  question  about  the  incorporation  of  stereo¬ 
chemical  considerations  into  the  mathematical  expres¬ 
sions  for  the  probability  distributions  of  X-ray 
reflexions  when  the  number  of  atoms  is  too  small  for 
the  central-limit  argument  (Wilson,  1949)  to  be 
applied.  If  the  contributions  of  individual  atoms 
to  the  expressions  for  the  structure  factor, 

$$
F_{hkl} = \sum_{j=1}^{n} f_j \exp\{2\pi i (h x_j + k y_j + l z_j)\}, \tag{1}
$$

are assumed to be independent, then the probability
distribution can be expanded as a series involving
orthogonal polynomials; a paper by Shmueli &
Wilson (1981) may be consulted for a detailed discussion
and references. In reality, of course, the finite
sizes and the stereochemical properties of atoms
prevent complete independence. There are central-limit
theorems valid for ‘almost independent’ variables
(Bernstein, 1922) and for variables dependent only on
a finite number $f(n)$ of their neighbours (Bernstein, 1927).
This second case seems plausible for the crystallographic
application, as the positions of neighbouring
atoms are closely correlated, but most molecules have
enough flexibility to prevent appreciable correlation
between widely separated atoms. Harker
(1953) suggested that protein molecules could be
regarded as consisting of a number of ‘globs’, with
atomic positions within a ‘glob’ being correlated, but
without correlation between ‘globs’. This seems
analogous to $f(n)$ dependence.

If  the  generalized  central-limit  theorem  is  applied  to 
the  case  of  non-independent  atomic  contributions,  the 
functional  forms  of  the  distributions  deduced  by 
Wilson  (1949)  are  retained  but  the  distribution 
parameter is increased from

$$
\Sigma = \sum_{j=1}^{n} f_j^2 \tag{2}
$$

(Wilson, 1942) to $\langle I \rangle$, the actual local average of
the  intensity  of  reflexion.  This  was  tacitly  assumed  by 
many  authors,  and  French  &  Wilson  (1978)  made  it 
an  explicit  empirical  postulate;  their  postulate  for  the 
value  of  the  parameter  has  been  justified  theoretically 
(Wilson,  1981).  In  the  independent  case  the  distri¬ 
bution  function  for  a  finite  number  of  atoms  can  be 
expressed  as  the  sum  of  the  ideal  distribution  (the 
central  limit)  and  a  series  of  correction  terms,  each 
correction  term  being  the  product  of  three  factors: 

(i)  the  ideal  distribution; 

(ii)  one  of  a  set  of  orthogonal  polynomials;  and 

(iii)  a  function  of  the  moments  of  the  distribution. 

In general such series, considered as functions of the
number $n$ of terms summed, are only asymptotic, but
intensity distributions appear to fulfil the conditions
for genuine convergence. My question is: Are there
analogous series for the sum of $n$ non-independent
random variables? One might guess that factors (i)
and (ii) would remain, but the functions of the
moments in (iii) would be altered in a manner analogous
to the substitution of $\langle I \rangle$ for $\Sigma$ as the
distribution parameter in the central-limit expression.
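
The change of the distribution parameter from the sum of squared scattering factors to the actual mean intensity can be illustrated with a small Monte Carlo experiment. The sketch below is not taken from the note: it assumes unit point atoms, uniformly random phases, and, as a deliberately extreme caricature of a ‘glob’, atoms within a glob that scatter exactly in phase. The function name `mean_intensity` and all numerical settings are illustrative choices.

```python
import cmath
import math
import random


def mean_intensity(n_atoms, glob_size, n_trials, rng):
    """Average |F|^2 for unit point atoms grouped into rigid 'globs'.

    Each glob receives an independent, uniformly distributed phase;
    atoms inside a glob are assumed to scatter exactly in phase.
    glob_size = 1 reproduces the fully independent case.
    """
    n_globs = n_atoms // glob_size
    total = 0.0
    for _ in range(n_trials):
        f = sum(glob_size * cmath.exp(2j * math.pi * rng.random())
                for _ in range(n_globs))
        total += abs(f) ** 2
    return total / n_trials


rng = random.Random(0)
n = 24
sigma = float(n)  # sum of f_j^2 with every f_j = 1

independent = mean_intensity(n, 1, 20000, rng)  # expected near sigma
globbed = mean_intensity(n, 4, 20000, rng)      # expected near 4 * sigma
print(f"Sigma = {sigma:.1f}")
print(f"<I>, independent atoms: {independent:.2f}")
print(f"<I>, globs of 4 atoms:  {globbed:.2f}")
```

In this idealised model the mean intensity for independent atoms stays close to the sum of squared scattering factors, while for in-phase globs of m atoms it rises to m times that sum, which is the kind of increase in the distribution parameter described above. Real correlations are weaker and depend on resolution, so the factor should not be read quantitatively.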
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Introduction 

In  many  publications  the  distribution  functions  of 
intensities  influenced  by  random  (counting)  errors 
have  been  discussed  and  estimates  of  standard  devia¬ 
tions  in  measured  intensities  have  been  worked  out 
(Wilson, 1978, 1980, and references therein). For the
experimentalist in X-ray diffraction, this field of
research is like a utopia: an ideal he can aim for but
can hardly reach with his experimental data.
The present paper will show that strong reflections
in particular meet obstacles on their way. In addition, we
will see that for weak reflections the background can be
a real bottleneck.

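As an example, accurate measurements for electron-density
studies will be discussed. One of the objects of
this field of research is to map deformation properties,
such as the deformation density

$$
\Delta\rho(\mathbf{r}) = V^{-1} \sum_{\mathbf{H}} \{F_o(\mathbf{H}) - F_c(\mathbf{H})\} \exp(-2\pi i\, \mathbf{H}\cdot\mathbf{r}), \tag{1}
$$

or the deformation potential

$$
\Delta\phi(\mathbf{r}) = \sum_{\mathbf{H}} [\{F_o(\mathbf{H}) - F_c(\mathbf{H})\} \exp(-2\pi i\, \mathbf{H}\cdot\mathbf{r})] / (\sin\theta/\lambda)^2. \tag{2}
$$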

The formulae are for a centrosymmetric crystal; $F_c(\mathbf{H})$
is based on a reference model of spherically symmetric
atoms, which we assume to be known. The measurements
have been set up with care. For the present
discussion we consider a data set in which ‘intrinsic’
systematic errors in the intensities, like extinction,
absorption, TDS and multiple diffraction, have been
eliminated. For the diffractometer work the following
conditions were fulfilled after careful checks:
homogeneous monochromatic X-ray beam, uniform
response of counter surface, correct alignment of
diffractometer, counter width and scan range chosen
such that full integrated intensities could be measured.
The question arises whether under these carefully
chosen measuring conditions the errors in the
intensities are due to counting statistics alone. On
the assumption that the answer is ‘yes’, standard
deviations for very strong and very weak reflections
will be estimated on the basis of approximate formulae
which, however, are sufficient to give an insight into
the essential features of these quantities.


Estimates  of  standard  deviations 

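The intensity $I(\mathbf{H}; \mathrm{net})$ of a reflection $\mathbf{H}$ is given by

$$
I(\mathbf{H}; \mathrm{net}) = [k/t(\mathbf{H})]\,[I_p(\mathbf{H}) - m(\mathbf{H})\, I_b(\mathbf{H})]; \tag{3}
$$

$k$ is a scale factor which is equal for all reflections and
which obeys the relation $k \sim I_0^{-1} v^{-1}$, where $I_0$ is the
intensity of the primary beam and $v$ is the volume
of the crystal; $t(\mathbf{H})$ = time spent on the ‘peak’ of
reflection $\mathbf{H}$; $m(\mathbf{H}) = t(\mathbf{H}; \mathrm{peak})/t(\mathbf{H}; \mathrm{background})$.
$I_p(\mathbf{H})$ and $I_b(\mathbf{H})$ are the numbers of counts measured
for peak and background respectively. The variance
$\sigma^2[I(\mathbf{H}; \mathrm{net})]$ is given by

$$
\sigma^2[I(\mathbf{H}; \mathrm{net})] = [k^2/t^2(\mathbf{H})]\,[I_p(\mathbf{H}) + m^2(\mathbf{H})\, I_b(\mathbf{H})]. \tag{4}
$$

The $|F_o(\mathbf{H})|^2$ value

$$
|F_o(\mathbf{H})|^2 = (Lp)\, I(\mathbf{H}; \mathrm{net}), \tag{5}
$$

has the variance

$$
\sigma^2[|F_o(\mathbf{H})|^2] = [k^2 (Lp)^2/t^2(\mathbf{H})]\,[I_p(\mathbf{H}) + m^2(\mathbf{H})\, I_b(\mathbf{H})]. \tag{6}
$$

(a) Very strong reflections

These reflections obey the assumption $m(\mathbf{H})\, I_b(\mathbf{H}) \ll I_p(\mathbf{H})$,
so that

$$
I(\mathbf{H}; \mathrm{net}) \approx k\, I_p(\mathbf{H})/t(\mathbf{H}), \tag{7}
$$

and

$$
\sigma^2[|F_o(\mathbf{H})|^2] = [k (Lp)^2/t(\mathbf{H})]\, I(\mathbf{H}; \mathrm{net}). \tag{8}
$$

With $\sigma(|F|) = \sigma(|F|^2)/2|F|$ we obtain from (8)
and (5)

$$
\sigma(|F_o(\mathbf{H})|) = k^{1/2} (Lp)^{1/2} / 2 t^{1/2}(\mathbf{H}) = k^{1/2} / 2 t_n^{1/2}(\mathbf{H}), \tag{9}
$$

with $t_n(\mathbf{H}) = t(\mathbf{H})/(Lp)$.

Conclusion: For this category of reflections $\sigma(|F_o(\mathbf{H})|)$
does not depend on $I(\mathbf{H}; \mathrm{net})$!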
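
As a numerical check on this conclusion, equations (3), (5), (6) and (9) can be coded directly. The sketch below is illustrative only: the function names and the numerical values of the scale factor, times and counts are hypothetical, and equations (3) to (9) are used in the form in which they are read here.

```python
import math


def net_intensity(k, t, m, i_peak, i_bkg):
    """Equation (3): I(H; net) = (k / t) * (I_p - m * I_b)."""
    return (k / t) * (i_peak - m * i_bkg)


def var_f_squared(k, t, m, i_peak, i_bkg, lp):
    """Equation (6): counting-statistics variance of |F_o|^2."""
    return (k * lp / t) ** 2 * (i_peak + m ** 2 * i_bkg)


def sigma_f(k, t, m, i_peak, i_bkg, lp):
    """sigma(|F|) = sigma(|F|^2) / (2 |F|), with |F|^2 from equation (5)."""
    f_sq = lp * net_intensity(k, t, m, i_peak, i_bkg)
    return math.sqrt(var_f_squared(k, t, m, i_peak, i_bkg, lp)) / (2.0 * math.sqrt(f_sq))


def sigma_f_strong(k, t, lp):
    """Equation (9): limiting value when the background is negligible."""
    return math.sqrt(k * lp / t) / 2.0


# Hypothetical settings: scale factor, peak time, time ratio, Lp factor.
k, t, m, lp = 2.0, 10.0, 1.0, 0.5
for counts in (1.0e4, 1.0e6):
    print(counts, sigma_f(k, t, m, counts, 0.0, lp), sigma_f_strong(k, t, lp))
```

With zero background the general expression reproduces (9) for any peak count, which is the independence of intensity stated in the conclusion; a non-zero background makes the standard deviation larger than this limit.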

(b)  Very  weak  reflections 

Rees (1976) has shown that for reflections with
$|F_o(\mathbf{H})|^2 < \sigma[|F_o(\mathbf{H})|^2]$ a good estimate for the
standard deviation is given by

$$
\sigma[|F_o(\mathbf{H})|] = \sigma^{1/2}[|F_o(\mathbf{H})|^2]/2. \tag{10}
$$

For not too small backgrounds, the near-zero reflections
obey the approximation

$$
I_p(\mathbf{H}) \approx m(\mathbf{H})\, I_b(\mathbf{H}). \tag{11}
$$

Combination of (6), (10) and (11) gives

$$
\sigma(|F_o(\mathbf{H})|) = k^{1/4}\, |F_b(\mathbf{H})|^{1/2}\, [1 + m(\mathbf{H})]^{1/4} / 2 t_n^{1/4}(\mathbf{H}), \tag{12}
$$

in which $|F_b(\mathbf{H})|^2$ is the number of background
counts per second corrected for $Lp$.

Conclusion: For this category of reflections it is the
background which determines the standard deviation!


The  necessity  of  reducing  the  background 

The accuracy required for the individual reflections
depends on the property which one wants to study.
For the potential, the accuracy must increase with
$[(\sin\theta)/\lambda]^{-2}$. For the density, on the other hand, all
standard deviations should be the same. Measuring the
weak reflections with adequate accuracy
can be a real problem because, under the condition where
(12) is valid, increasing the measuring time is a very
inefficient way to improve accuracy. For the
very strong reflections (formula 9) one must count
4 times longer to improve $\sigma$ by a factor of 2;
in the present situation one must count 16 times
longer! A much more efficient way is to reduce $|F_b|$,
which can be done by monochromatisation and/or
collimation of the incoming and outgoing beam.
Table 1 illustrates the reduction in background which
can be achieved in this way.
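
The factors of 4 and 16 quoted above follow from the time dependence of (9) and (12). The short sketch below makes the scaling explicit; it assumes the reading of (12) in which the standard deviation varies as the fourth root of $k\,|F_b|^2 (1 + m)/t_n$, and the function names and numbers are purely illustrative.

```python
def sigma_strong(k, t_n):
    """Equation (9): sigma(|F_o|) = k^(1/2) / (2 t_n^(1/2))."""
    return k ** 0.5 / (2.0 * t_n ** 0.5)


def sigma_weak(k, f_b_sq, m, t_n):
    """Equation (12), as read here:
    sigma(|F_o|) = [k |F_b|^2 (1 + m) / t_n]^(1/4) / 2.
    """
    return (k * f_b_sq * (1.0 + m) / t_n) ** 0.25 / 2.0


# Hypothetical settings.
k, f_b_sq, m, t_n = 2.0, 9.0, 1.0, 10.0

# Strong reflections: 4 times the counting time halves sigma.
print(sigma_strong(k, 4 * t_n) / sigma_strong(k, t_n))

# Weak reflections: 16 times the counting time is needed for the same gain,
print(sigma_weak(k, f_b_sq, m, 16 * t_n) / sigma_weak(k, f_b_sq, m, t_n))

# whereas reducing the background |F_b|^2 by a factor of 16 has the same effect.
print(sigma_weak(k, f_b_sq / 16, m, t_n) / sigma_weak(k, f_b_sq, m, t_n))
```

The last two ratios are equal, which is the quantitative content of the remark that lowering the background is as effective as a sixteen-fold increase in measuring time.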

In the limiting case with $|F_b(\mathbf{H})|^2 \to 0$, when
approximation (11) no longer holds, $\sigma(|F_o(\mathbf{H})|)$
can be obtained directly from (6) and (10) with (3)
and (5):

$$
\sigma(|F_o(\mathbf{H})|) = [k/t_n(\mathbf{H})]^{1/4}\, |F_o(\mathbf{H})|^{1/2} / 2. \tag{13}
$$

In this case, for equal standard deviations, the
measuring times $t_n(\mathbf{H})$ are proportional to $|F_o(\mathbf{H})|^2$
and now become very short for the very weak
reflections! Realisation of the limit $F_b \to 0$ is also the
preferable way to solve the problems arising for
reflections with negative $I(\mathbf{H}; \mathrm{net})$ intensities, the
omission of which gives a bias in the maps.


Is  counting  statistics  the  only  error  source  ? 

In a recent experiment on forsterite, Mg2SiO4, we
wanted to measure the low-order reflections with
extreme accuracy to study the deformation potential
$\Delta\phi(\mathbf{r})$.

The conditions were such that for the strong
reflection 020 an accuracy of 0.04% was required.
The diffractometer was programmed accordingly, but
repeated measurements of the reflection revealed an
r.m.s. variation ten times larger. To discover the
reason for this variation the reflections 020, 021, 062,
133 and 211 were measured alternately. In comparison
with the other 4 reflections, 020 showed a strong
variation, from −1.2 to 5.6% (for discussion see
below). Results for the other 4 reflections are given in
Fig. 1. With use of these reflections as intensity
reference reflections, corrections according to the full line
could have been made. We see, however, that apart
from the ‘jump’ in intensity there are pseudo-random
variations which are clearly larger than expected on
the basis of the 0.1% statistical error aimed for in the
on-line experiment. In experiments aiming at a
statistical accuracy of, say, 1%, perhaps not even the
‘jump’, and certainly not the additional pseudo-random
variations, would have been noticed.
Nevertheless they would have been there, making the
distribution function for the reflections different from
the theoretical distribution function based on counting
statistics.

Fig. 1. Relative intensities of 4 reflections measured
consecutively in a certain period of time. The sequence (time)
axis is horizontal (dashed); the vertical axis is
$1000\,[I/\langle I \rangle - 1]$. Each of the individual measurements
has a standard deviation of ≈ 0.1%. The figure shows a ‘jump’
within 2 hours of about 1% and moreover a spread around the
full line larger than expected.

Analysis of step-scan values for the two extreme
values of reflection 0 2 0 (Fig. 2) showed that this
discrepancy is mainly due to a non-constant motor
speed. It may be possible to remove part of it by
reconstructing the apparatus. But even then, it cannot be
excluded that diffractometers are subject to small
variations in their electronic and mechanical parts,
giving intensities which do not obey distribution
functions based on counting statistics. Whether or not
the deviations are important depends on the accuracy
required for the experiment.

Fig. 2. Profiles ($\theta$-$2\theta$ scan) of the two extreme situations of
reflection 0 2 0. There is some difference in height but the
main difference is found in width (which on an absolute scale
is small, however: a discrepancy of not more than 0.015 degree).

Abstract

Three alternatives to R-tests are compared in a
computer simulation study of power and robustness:
Rothstein, Richardson, and Bell’s jackknife test on
the R-factor ratio, Arvesen’s jackknife test for the
correlation coefficient, and Pitman’s test for the
correlation coefficient, which uses Pearson’s statistic. It is
found that unless one could improve the approximate
null distributions for Arvesen’s and Pitman’s tests,
Rothstein et al.’s procedure is best, having simulated
probabilities of Type I error closest to the test’s
nominal $\alpha$ and being reasonably robust and powerful
for all distributions considered.

Introduction 

Hamilton’s test on the R-factor ratio,

$$
\mathcal{R} = R_{\mathrm{I}} / R_{\mathrm{II}}, \tag{1}
$$

where $R_{\mathrm{I}}$ and $R_{\mathrm{II}}$ are the residuals associated with
models I and II, is well known to crystallographers as
a means of determining whether there is a significant
difference in the R factors, and hence in the goodness of fit
of the models (Hamilton, 1964, 1965).

Hamilton was able to derive the null distribution
of $\mathcal{R}$ by assuming that the structure factors are linear
in the parameters (2), and that the hypothesis under test
is a linear relationship on the parameters (3):

$$
\mathbf{F} = \mathbf{A}\mathbf{x} + \boldsymbol{\epsilon}, \tag{2}
$$

$$
\mathbf{Q}\mathbf{x} = \mathbf{Z}, \tag{3}
$$

where $\mathbf{F} = (|F_i|_o - |F_i|_c)$, $|F_i|_o$ and $|F_i|_c$ are the
observed and calculated values of the $i$th structure
factor, $\mathbf{A} = (\partial |F_i|_c / \partial x_j)$, $\mathbf{x} = (\Delta x_j)$, $x_j$ is the $j$th
parameter and $\Delta x_j$ is the correction to be applied
to the $j$th parameter. The vector $\boldsymbol{\epsilon}$ (2) represents a
collection of random, non-systematic discrepancies
between the structure factor differences and the
model values.

The null distribution of $\mathcal{R}$ is given by

$$
\mathcal{R}^2 \sim \frac{b}{N - m}\, \mathcal{F}_{b,\,N-m} + 1, \tag{4}
$$

where $b$ is the rank of $\mathbf{Q}$ (3), $N$ is the number of
observations, $m$ is the number of parameters in the
least constrained model, and $\mathcal{F}$ is the F-statistic
critical value at the $\alpha$-level of significance. Derivation
of this result requires $\boldsymbol{\epsilon}$ (2) to be independent and
identically normally distributed.

Our objective in this paper is to report various
alternatives to Hamilton’s test, that is, ways to discriminate
between Models I and II, where we need not assume
linear models (2), or a linear hypothesis under test (3),
and where minimal assumptions are made concerning
the distribution of $\boldsymbol{\epsilon}$ (2).

Procedures 

Rothstein, Richardson & Bell’s (1978) [RRB’s]
jackknife test is one such alternative, already applied
to crystallographic problems. The test is based on an
underlying model dealing with the distribution of $\Delta_i$
for each of Models I and II, where

$$
\Delta_i^{\mathrm{I}} = \sqrt{w_i}\,(|F_i|_o - |F_i|_c^{\mathrm{I}}), \tag{5}
$$

and $w_i$ is the weight associated with the $i$th observation;
an analogous expression defines $\Delta_i^{\mathrm{II}}$. In
particular, $[\Delta_i^{\mathrm{I}}, \Delta_i^{\mathrm{II}}]$ are assumed bivariate distributed
with variance-covariance matrix

$$
\begin{bmatrix} \sigma_{\mathrm{I}}^2 & \sigma_{\mathrm{I,II}} \\ \sigma_{\mathrm{I,II}} & \sigma_{\mathrm{II}}^2 \end{bmatrix} \tag{6}
$$

and $\sigma_{\mathrm{I,II}} \neq 0$.

The null hypothesis that the two models are equally
good descriptions of the observed data,

$$
H_0:\ \sigma_{\mathrm{I}}^2 = \sigma_{\mathrm{II}}^2, \tag{7}
$$

is tested versus the alternative that Model II is the
better model,

$$
H_a:\ \sigma_{\mathrm{I}}^2 > \sigma_{\mathrm{II}}^2. \tag{8}
$$

Rejecting $H_0$ is evidence for rejecting Model I.
RRB’s procedure involves computing

$$
\mathcal{L}_{-i} = N \ln(\mathcal{R}^2) - (N - 1) \ln(\mathcal{R}^2_{-i}), \tag{9}
$$

where $\mathcal{R}^2_{-i}$ means $\mathcal{R}^2$ computed with the $i$th
reflection deleted from the calculation.
Their test statistic is given by

$$
Q' = \bar{\mathcal{L}} / V, \tag{10}
$$

where $\bar{\mathcal{L}}$ is the average value of $\mathcal{L}_{-i}$ (9) and

$$
V^2 = \sum_{i} (\mathcal{L}_{-i} - \bar{\mathcal{L}})^2 / [N(N - 1)]. \tag{11}
$$

RRB conjectured that under $H_0$ (7), $Q'$ is approximately
$t$-distributed with $N - 1$ degrees of freedom, i.e.
asymptotically normal. This paper, in part, tests this
conjecture in computer simulation studies.
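
The jackknife calculation in (9) to (11) can be sketched as follows. This is an illustration and not RRB’s program: the squared ratio is assumed here to be the ratio of the sums of squared weighted residuals of the two models, which is a natural reading but is not spelled out in the text above, and the function names are hypothetical.

```python
import math


def ratio_sq(d1, d2):
    """Assumed definition: sum of squared residuals of Model I over Model II."""
    return sum(x * x for x in d1) / sum(y * y for y in d2)


def pseudo_values(d1, d2):
    """Equation (9): N ln(R^2) - (N - 1) ln(R^2 with reflection i deleted)."""
    n = len(d1)
    full = math.log(ratio_sq(d1, d2))
    values = []
    for i in range(n):
        loo = math.log(ratio_sq(d1[:i] + d1[i + 1:], d2[:i] + d2[i + 1:]))
        values.append(n * full - (n - 1) * loo)
    return values


def rrb_statistic(d1, d2):
    """Equations (10) and (11): mean pseudo-value divided by its standard error."""
    vals = pseudo_values(d1, d2)
    n = len(vals)
    mean = sum(vals) / n
    v = math.sqrt(sum((x - mean) ** 2 for x in vals) / (n * (n - 1)))
    return mean / v


# Hypothetical weighted residuals for six reflections under Models I and II.
delta_1 = [3.0, -2.5, 4.0, -3.5, 2.0, -4.5]
delta_2 = [1.0, -1.2, 0.8, -0.9, 1.1, -1.0]
print(rrb_statistic(delta_1, delta_2))
```

Under the conjecture quoted above, the resulting value would be compared with a one-sided critical value of Student’s t with N − 1 degrees of freedom. Note that the statistic is undefined when all pseudo-values coincide, since the denominator in (10) is then zero.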

Another viable alternative to Hamilton’s test
concerns the statistical correlation of linear combinations
of $\Delta^{\mathrm{I}}$ and $\Delta^{\mathrm{II}}$:

$$
A_i = \Delta_i^{\mathrm{I}} + \Delta_i^{\mathrm{II}}, \tag{12}
$$

$$
B_i = \Delta_i^{\mathrm{I}} - \Delta_i^{\mathrm{II}}. \tag{13}
$$

Since the covariance of $A$ and $B$ equals
$\sigma_{\mathrm{I}}^2 - \sigma_{\mathrm{II}}^2$, $H_0$ (7) and $H_a$ (8) become equivalent to

$$
H_0':\ \rho_{AB} = 0, \tag{14}
$$

$$
H_a':\ \rho_{AB} > 0, \tag{15}
$$

respectively, where $\rho_{AB}$ is the correlation coefficient
for $A$ and $B$.

Arvesen’s (1969) jackknife test for the correlation
coefficient is believed to be effective in testing the
hypothesis $\rho = 0$, based on the results of an empirical
study using 1000 samples ($N$ = 25, 50) published by
Johnson (1978).

For the sake of comparison, Pitman’s (1939) test
of the hypothesis $H_0'$ (14) will also be considered. This
test uses Pearson’s correlation coefficient $r$ and is a
‘normal theory’ procedure; the distribution of $A_i$ and
$B_i$ must be bivariate normal. Accordingly, it violates
a major criterion for a viable alternative to Hamilton’s
test, that of minimal assumptions being made
concerning the distribution of $\boldsymbol{\epsilon}$ (2).
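
A minimal sketch of the sum-and-difference construction (12), (13) and of a Pearson-based test of (14) is given below. The text does not state the form of the test statistic; the code uses the standard normal-theory statistic for a zero correlation, $t = r\sqrt{(N-2)/(1-r^2)}$ with $N - 2$ degrees of freedom, which is an assumption of this sketch, as are the function names and the example data.

```python
import math


def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pitman_t(d1, d2):
    """Correlate A = d1 + d2 with B = d1 - d2 (equations 12 and 13).

    Returns (r, t), where t = r * sqrt((N - 2) / (1 - r^2)) is the usual
    normal-theory statistic for testing a zero correlation.
    """
    a = [u + v for u, v in zip(d1, d2)]
    b = [u - v for u, v in zip(d1, d2)]
    r = pearson_r(a, b)
    n = len(d1)
    return r, r * math.sqrt((n - 2) / (1.0 - r * r))


# Equal sample variances: the correlation of A and B vanishes.
print(pitman_t([1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0]))

# Model I residuals more spread out than Model II: positive correlation.
print(pitman_t([2.0, -2.0, 4.0, -4.0], [1.0, -1.0, 1.0, -1.0]))
```

A positive correlation of the sum with the difference indicates that the Model I residuals have the larger variance, which is the direction of the alternative (15).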


Computer  simulation 

We will soon publish (Bell, Rothstein, & Li, 1982) an
assessment of the performance of RRB’s test of $H_0$
(7) and Arvesen’s test of $H_0'$ (14) by a Monte Carlo
simulation involving 20,000 experiments, each
generating two samples ($N$ = 12) of pseudo-random numbers
($\Delta^{\mathrm{I}}$ and $\Delta^{\mathrm{II}}$ values) drawn from the bivariate normal,
bivariate (1% and 5%) Cauchy-contaminated normal,
and bivariate gamma distributions. As Bell et al. are
publishing relevant technical details of the simulation,
only typical results, sufficient to generalize the relative
performance of the tests, will be cited here.

When the null hypothesis is true, a test performs
well if the simulated significance levels $\alpha$ (probabilities
of a Type I error) are close to the test’s nominal $\alpha$.
Bell et al.’s results (Table 1) show that RRB’s jackknife
test gives significance levels which are closer to the
theoretical values than those obtained from Arvesen’s
procedure for each of the three symmetric distributions,
and although the errors are comparable for the
skewed distribution (bivariate gamma), RRB’s test is
more conservative. It is clear that RRB’s test stays
consistently closer to the nominal $\alpha$ than do Arvesen’s or
Pitman’s tests, and the latter two tests consistently
reject $H_0$ (or $H_0'$) too often when the hypothesis is true.


Table 1. Empirical type I error probabilities using 20,000 samples ($N$ = 12).

Table 2. Empirical power using 20,000 samples ($N$ = 12) with $\sigma_{\mathrm{I}}^2$ = [illegible], $\sigma_{\mathrm{II}}^2$ = 1.0, $\sigma_{\mathrm{I,II}}$ = 0.9

With the null hypothesis false, a test performs well
if the empirical power, $1 - \beta$, is large. ($\beta$ is the empirical
probability of accepting the false hypothesis.) After
adjusting the critical values for each test to make the
simulated $\alpha$ agree with the nominal $\alpha$, the empirical
power obtained by Bell et al. appears in Table 2. Not
surprisingly, Pitman’s test performs best for the
bivariate normal and mildly (1%) Cauchy-contaminated
normal distributions; otherwise it has consistently
less power. For the other distributions, the power
of Arvesen’s test is consistently better than RRB’s,
although only in the third decimal place.

In  conclusion,  if  one  could  improve  the  null 
distributions  for  Arvesen’s  test  and  Pitman’s  test,  the 
former  would  be  the  best  for  the  non-normal  distri¬ 
butions  considered  by  Bell  et  al.  and  the  latter  best  for 
normally  distributed  data.  In  the  absence  of  such 
results,  however,  RRB’s  test  is  the  best  procedure, 
consistently  performing  well  under  H0  and  being 
reasonably  robust  and  powerful. 

I  wish  to  emphasize  that  the  work  described  in  this 
paper  is  drawn  from  collaborations  and/or  discussion 
with  Mr  W.  D.  Bell,  Ms  H.  L.  Gordon,  and  Drs 
W.  -K.  Li,  M.  F.  Richardson,  and  D.  M.  Thompson. 
Our  research  was  supported,  in  part,  by  grants  from 
the  Natural  Sciences  and  Engineering  Research 
Council  of  Canada. 
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The  Residual  Function  R2  as  Discriminator 
Criterion  in  Structure  Determination 

By  A.  T.  H.  Lenstra 

Department  of  Chemistry,  University  of  Antwerp 
( U.I.A. ),  Universiteitsplein  1,  B-2610  Wilrijk,  Belgium 


Introduction 


In  automated  procedures  to  determine  a  crystal 
structure,  one  obviously  needs  criteria  to  discriminate 
between  correct  and  incorrect  structure  models. 
Such  structure  models,  to  be  checked  on  their  relia¬ 
bility,  can  be  obtained  in  many  ways.  Heavy-atom 
analysis,  rotation  and  translation  searches  or  a  direct- 
method  routine  are  the  most  likely  sources  for  tenta¬ 
tive  structure  models.  The  structures  to  be  tested  can 
be  organic  as  well  as  inorganic.  Therefore  the  discri¬ 
minator  function  must  be  very  flexible  in  terms  of 
applicability  in  experimental  situations.  As  a  conse¬ 
quence  mathematical  discriminator  functions  must 
be  preferred  above  e.g.  chemical  criteria,  of  which 
the  applicability  is  restricted  to  one  class  of  chemical 
compounds, say organic structures. From our own
experience (Lenstra 1973; van de Mieroop 1979) we
know that the heavy-atom analysis can be successfully
automated using the residual $R_2$ as a mathematical
indicator function. $R_2$ is defined as

$$
R_2 = \sum_{\mathbf{H}} (I_N - I_n)^2 \Big/ \sum_{\mathbf{H}} I_N^2 \tag{1}
$$

where the observed structure contains $N$ atoms and
the tentative model contains $n$ ($n < N$) atoms. In
general the model to be tested has $g$ atoms at correct
atomic sites and $f$ atoms at incorrect positions. This
situation will be denoted as $(g, f)$ with $g + f = n$.

The functional behaviour of $R_2$ has been evaluated
in many papers (Wilson 1969, 1974, 1976; Lenstra
1973, 1974, 1979; Parthasarathy & Parthasarathi
1972; Srinivasan & Parthasarathy 1976; Van de
Mieroop & Lenstra 1978; Petit, Lenstra & Van Loock
1981; Petit & Lenstra 1981). For theoretical
convenience the crystal structure is regarded as a system
of non-vibrating point atoms. This has the advantage
that the scattering power of the atoms is independent
of the Bragg angle $\theta$. Because of this, (1) can be rephrased
as

$$
R_2 = \sum_{\mathbf{H}} (E_N^2 - \sigma_1^2 E_n^2)^2 \Big/ \sum_{\mathbf{H}} E_N^4, \tag{2}
$$

where

$$
\sigma_1^2 = \sum_{j=1}^{n} f_j^2 \Big/ \sum_{j=1}^{N} f_j^2 .
$$

In the rest of this paper we discuss some properties
of $R_2$ or related functions. The parameters which are
especially dealt with in terms of their influence on $R_2$
are the size of the structure model ($\sigma_1^2 \approx n/N$) and
the threshold $a$. This threshold is applied to the $E_N^2$ data
only, and its introduction means that all observed
intensities with $E_N^2 < a$ are omitted in the practical
computation of $R_2$. The main reason for introducing the
parameter $a$ is to reduce the time necessary to
evaluate $R_2$, which makes $R_2$ more applicable in
practical work.

In §§ 2 and 3, $R_2$ is calculated as a simple point
estimator. The line of argument is analogous to that of the
references cited. Within this framework we will
study:

(i) situation $(n, 0)$ versus $(0, n)$. From a mathematical
point of view this is the simplest
situation. For practical use this description is
only of interest for the traditional rotation-search
procedures in reciprocal space.

(ii) situations $(g, f)$. This is the general description
applicable to any model. It will be shown that $R_2(g, f)$
is a predictable quantity.

For a proper use of $R_2$ in applied crystallography
it is evident that knowledge of a point estimator alone is
insufficient, though better than nothing. So in § 4 the
distribution $P(R_2)$ is discussed indirectly. It is shown that
any moment $\langle (R_2 - \bar{R}_2)^q \rangle$ can be calculated. For brevity only
the space group $P1$ is dealt with.

In § 5 rotation search is discussed, as an example to
show the validity of the present theory.


2. Correct and incorrect models with n atoms (n ≤ N)

Replacing the summations in (2) by the number of
observations times the actual averages, we find

$$
R_2 = \langle (E_N^2 - \sigma_1^2 E_n^2)^2 \rangle / \langle E_N^4 \rangle. \tag{3}
$$

The angular brackets indicate an average over all
experimental intensities available. A reduction in the
number of reflections used to calculate actual $R_2$
values speeds up the application procedure. To
accelerate the reliability test itself we follow the
standard practice implemented in rotation search
(e.g. Tollin & Rossmann, 1966), i.e. eliminate all $E_N^2$
values below a certain threshold $a$. Then $R_2(a)$ can
be written as

$$
R_2(a) = \left( \langle E_N^4 \rangle_a - 2\sigma_1^2 \langle E_N^2 E_n^2 \rangle_a + \sigma_1^4 \langle E_n^4 \rangle_a \right) / \langle E_N^4 \rangle_a. \tag{4}
$$

$R_2$ is, of course, only a proper mathematical indicator
function if its final numerical value remains
predictable, though now as a function of the threshold $a$.
To obtain numerical values for $R_2(a)$ the actual
averages in (4) are replaced by distribution averages.
To avoid unwanted complexity due to the overwhelming
number of available intensity distributions, we
have to restrict ourselves. In this paper we tackle the
problems in terms of equal-atom structures in the
space groups $P1$ and $P\bar{1}$. The basic distributions which
we use throughout this chapter are the Wilson
distributions:

$$
P(E) = 2E \exp[-E^2] \quad \text{for } P1
$$

and

$$
P(E) = (2/\pi)^{1/2} \exp[-E^2/2] \quad \text{for } P\bar{1}.
$$

For the sake of completeness it should be noted that
these functions are only strictly valid for structures
with a large number of randomly located atoms in
the unit cell. Consequently, these distributions are
asymptotic approximations of the more precise
functions, given by Srinivasan & Parthasarathy (1976),
in which the number of atoms determines the more
exact algebraic formulation.

2.1.  Incorrect  models  (0,  n);  unrelated  case 

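In this situation $E_N$ and $E_n$ are independent of each
other. So the joint probability distribution $P(E_N, E_n)$
is simply the product $P(E_N)P(E_n)$. For unrelated
models (4) can be simplified to

$$R_2(a) = (\langle E_N^4 \rangle_a + \sigma_1^4 \langle E_n^4 \rangle - 2\sigma_1^2 \langle E_N^2 \rangle_a \langle E_n^2 \rangle) / \langle E_N^4 \rangle_a. \tag{5}$$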

The necessary distribution averages are easily
obtained, because

$$\langle E^q \rangle_a = \int_{\sqrt{a}}^{\infty} E^q P(E)\, dE \Big/ \int_{\sqrt{a}}^{\infty} P(E)\, dE.$$

The individual intensity moments are summarised in
Table 1. Substitution of these moments in (5) gives for
the space group P1

$$R_2(a) = [a^2 + 2a(1 - \sigma_1^2) + 2(\sigma_1^4 - \sigma_1^2 + 1)]\,[a^2 + 2a + 2]^{-1}. \tag{6}$$


Table 1. Relevant moments for P1 and P1̄ as a function of the threshold a for the two extreme situations (n, 0) and (0, n).

Moment                          P1                    P1̄
$\langle E_N^4 \rangle_a$       $a^2 + 2a + 2$        $3 + (3 + a)Q$

For the space group P1̄ we find

$$R_2(a) = 1 + \left\{ -2\sigma_1^2\,\mathrm{erfc}\sqrt{a/2} - 2\sigma_1^2 \sqrt{2a/\pi}\,\exp(-a/2) + 3\sigma_1^4\,\mathrm{erfc}\sqrt{a/2} \right\} \left[ 3\,\mathrm{erfc}\sqrt{a/2} + (3 + a)\sqrt{2a/\pi}\,\exp(-a/2) \right]^{-1}. \tag{7}$$


The functional behaviour of R2(a) is visualized in
Figs. 1 (P1) and 2 (P1̄). It is clear that for a given
model of size $\sigma_1^2$, R2 depends strongly on the applied
threshold.
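
The dependence on the threshold can be checked numerically. The following Python sketch is an illustration only, not the authors' program. It evaluates R2(a) for unrelated models from the truncated Wilson moments, assuming the untruncated model moments are ⟨E_n²⟩ = 1 and ⟨E_n⁴⟩ = 2 (P1) or 3 (P1̄), and using Q = (2a/π)^{1/2} exp(−a/2)/erfc[(a/2)^{1/2}] for the centric case. All function names are hypothetical.

```python
import math


def truncated_wilson_moments(a, centric):
    """Return (<E_N^2>_a, <E_N^4>_a) for intensities E_N^2 >= a.

    Acentric (P1): E^2 is exponentially distributed.
    Centric (P-1): E^2 follows a chi-square law with one degree of freedom.
    """
    if not centric:
        return 1.0 + a, a * a + 2.0 * a + 2.0
    q = (math.sqrt(2.0 * a / math.pi) * math.exp(-a / 2.0)
         / math.erfc(math.sqrt(a / 2.0)))
    return 1.0 + q, 3.0 + (3.0 + a) * q


def r2_unrelated(sigma1_sq, a, centric=False):
    """R2(a) for a model that is unrelated to the observed structure."""
    m2, m4 = truncated_wilson_moments(a, centric)
    model_m4 = 3.0 if centric else 2.0  # <E_n^4> of the (untruncated) model
    return (m4 + sigma1_sq ** 2 * model_m4 - 2.0 * sigma1_sq * m2) / m4


def r2_unrelated_p1_closed_form(sigma1_sq, a):
    """The P1 result written out as an explicit polynomial ratio."""
    s = sigma1_sq
    numerator = a * a + 2.0 * a * (1.0 - s) + 2.0 * (s * s - s + 1.0)
    return numerator / (a * a + 2.0 * a + 2.0)


if __name__ == "__main__":
    # Tabulate R2 against the threshold for a model with sigma_1^2 = 0.5.
    for a in (0.0, 1.0, 2.0, 4.0):
        print(a,
              round(r2_unrelated(0.5, a), 4),
              round(r2_unrelated(0.5, a, centric=True), 4))
```

At a = 0 the P1 value reduces to 1 + σ1⁴ − σ1², which gives 0.75 for σ1² = 0.5.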


Fig. 1. R2 for the unrelated case in space group P1.


Fig. 2. R2 for the unrelated case in space group P1̄.
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2.2.  Correct  models  (n,  0);  related  case 


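For correct structure models the structure-factor
equation is

$$E_N = E_n + E_r$$

where $E_r$ represents the unknown part of the structure
(size: N − n). Since $E_N$ and $E_n$ are now mutually
dependent the joint distribution $P(E_N, E_n)$ is no longer
the simple product of $P(E_N)$ and $P(E_n)$. The joint
distributions we need are (see Srinivasan & Parthasarathy,
1976):

$$P(E_N, E_n) = \frac{4 E_N E_n}{\sigma_2^2} \exp\left[-\frac{E_N^2 + E_n^2}{\sigma_2^2}\right] I_0\left(\frac{2\sigma_1 E_N E_n}{\sigma_2^2}\right) \tag{8}$$

and

$$P(E_N, E_n) = \frac{2}{\pi\sigma_2} \exp\left[-\frac{E_N^2 + E_n^2}{2\sigma_2^2}\right] \cosh\left(\frac{\sigma_1 E_N E_n}{\sigma_2^2}\right), \tag{9}$$

which are derived for the space groups P1 and P1̄
respectively.

$I_0$ is a modified Bessel function of the first kind of
order zero and $\sigma_2^2 = 1 - \sigma_1^2$. The intensity moments
necessary to evaluate R2 from (4) are listed in
Table 1.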

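The algebraic derivation of these moments is
illustrated with one single example, viz.

$$\langle E_N^2 E_n^2 \rangle_a = \frac{A}{B} = \frac{\int_{\sqrt{a}}^{\infty} \int_0^{\infty} E_N^2 E_n^2\, P(E_N, E_n)\, dE_n\, dE_N}{\int_{\sqrt{a}}^{\infty} \int_0^{\infty} P(E_N, E_n)\, dE_n\, dE_N}. \tag{10}$$

(a) The space group P1: Substituting (8) in (10)
and rewriting the Bessel function as its series (Abramowitz
& Stegun, 1968), we have

$$A = \sum_{k=0}^{\infty} \frac{4\sigma_1^{2k}}{(k!)^2\, \sigma_2^{4k+2}} \int_{\sqrt{a}}^{\infty} E_N^{2k+3} \exp(-E_N^2/\sigma_2^2)\, dE_N \int_0^{\infty} E_n^{2k+3} \exp(-E_n^2/\sigma_2^2)\, dE_n.$$

Substituting $Z_N = E_N^2$ and $Z_n = E_n^2$ and using the
following standard integrals

$$\int_a^{\infty} x^m e^{-nx}\, dx = e^{-an} \sum_{r=0}^{m} \frac{m!}{(m-r)!}\, \frac{a^{m-r}}{n^{r+1}}$$

$$\int_0^{\infty} x^n e^{-bx}\, dx = \frac{n!}{b^{n+1}}$$

for b > 0 and n positive,

the resulting double series $\sum_k \sum_r$ can be shown to be
equal to:

$$e^{-a}\,[\sigma_1^2 (a^2 + a + 1) + 1 + a].$$

The normalisation factor B is given by $e^{-a}$, a result
easily found using the marginal Wilson distribution
$P(E_N)$.

Consequently: $\langle E_N^2 E_n^2 \rangle_a = [\sigma_1^2 (a^2 + a + 1) + 1 + a]$
as given in Table 1.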

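(b) The space group P1̄: The recipe goes as follows.
Substituting (9) in (10) and applying the definition
cosh x = (e^x + e^{−x})/2, the numerator A is given by

$$A = \frac{\sigma_2^5}{\pi} \int_{\sqrt{a}/\sigma_2}^{\infty} u^2 \exp\left[-\frac{u^2}{2}\right] \int_0^{\infty} v^2 \exp\left[-\frac{v^2}{2}\right] [\exp(\sigma_1 uv) + \exp(-\sigma_1 uv)]\, dv\, du,$$

where $u = E_N/\sigma_2$ and $v = E_n/\sigma_2$.

Applying the standard integrals

$$\int_0^{\infty} \exp[-(at^2 + bt + c)]\, dt = \frac{1}{2}\sqrt{\frac{\pi}{a}}\, \exp\left[\frac{b^2 - 4ac}{4a}\right] \mathrm{erfc}\left(\frac{b}{2\sqrt{a}}\right)$$

and

$$i^n\,\mathrm{erfc}(z) = \frac{2}{\sqrt{\pi}} \int_z^{\infty} \frac{(t - z)^n}{n!} \exp(-t^2)\, dt,$$

and since $B = \mathrm{erfc}\sqrt{a/2}$ (again easily found from the
marginal distribution) we have

$$\langle E_N^2 E_n^2 \rangle_a = 1 + 2\sigma_1^2 + [1 + \sigma_1^2 (2 + a)]\,Q \quad \text{with} \quad Q = \sqrt{2a/\pi}\,\exp(-a/2) \big/ \mathrm{erfc}\sqrt{a/2}.$$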
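
The two closed forms for ⟨E_N²E_n²⟩_a can be cross-checked by simulation. The sketch below is an illustration under stated assumptions, not the authors' derivation: it assumes that the normalised structure factors obey E_N = σ1 E_n + σ2 E_r, with E_n and E_r independent Gaussian variables of unit mean square (complex for P1, real for P1̄), and it keeps only the samples with E_N² ≥ a. The function names and the sample size are hypothetical choices.

```python
import math
import random


def simulated_cross_moment(sigma1_sq, a, centric, n_samples=200_000, seed=1):
    """Monte Carlo estimate of <E_N^2 E_n^2> over samples with E_N^2 >= a."""
    rng = random.Random(seed)
    s1 = math.sqrt(sigma1_sq)
    s2 = math.sqrt(1.0 - sigma1_sq)
    half = math.sqrt(0.5)  # standard deviation per component in the acentric case
    total, kept = 0.0, 0
    for _ in range(n_samples):
        if centric:
            e_n = rng.gauss(0.0, 1.0)
            e_r = rng.gauss(0.0, 1.0)
        else:
            e_n = complex(rng.gauss(0.0, half), rng.gauss(0.0, half))
            e_r = complex(rng.gauss(0.0, half), rng.gauss(0.0, half))
        e_big = s1 * e_n + s2 * e_r
        z_big = abs(e_big) ** 2
        if z_big >= a:  # apply the threshold to the observed intensity only
            total += z_big * abs(e_n) ** 2
            kept += 1
    return total / kept


def predicted_cross_moment(sigma1_sq, a, centric):
    """Closed forms quoted in the text for P1 and P-1."""
    if not centric:
        return sigma1_sq * (a * a + a + 1.0) + 1.0 + a
    q = (math.sqrt(2.0 * a / math.pi) * math.exp(-a / 2.0)
         / math.erfc(math.sqrt(a / 2.0)))
    return 1.0 + 2.0 * sigma1_sq + (1.0 + sigma1_sq * (2.0 + a)) * q


if __name__ == "__main__":
    for centric in (False, True):
        print(centric,
              simulated_cross_moment(0.5, 1.0, centric),
              predicted_cross_moment(0.5, 1.0, centric))
```

With a fixed seed the estimate is reproducible; agreement is expected only within the sampling error, which grows as the threshold removes more samples.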


Substitution of the moments listed in Table 1 in (4)
gives R2(a) for correct models [type (n, 0)].

In the space group P1 we find

$$R_2(a) = \{a^2 (\sigma_1^8 - 2\sigma_1^4 + 1) + 2a(-\sigma_1^8 + 2\sigma_1^6 - \sigma_1^4 - \sigma_1^2 + 1) + 2(1 - \sigma_1^2)\}\,\{a^2 + 2a + 2\}^{-1}. \tag{11}$$

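The behaviour of R2 for correct models in the space
group P1̄ is algebraically given by

$$R_2(a) = (1 - \sigma_1^4)^2 + \left\{ 3\sigma_1^4 (1 - \sigma_1^2)^2\,\mathrm{erfc}\sqrt{a/2} - 2\sigma_1^2 (1 - \sigma_1^2)(1 - 3\sigma_1^4) \left[\mathrm{erfc}\sqrt{a/2} + \sqrt{2a/\pi}\,\exp(-a/2)\right] \right\} \left[ 3\,\mathrm{erfc}\sqrt{a/2} + (3 + a)\sqrt{2a/\pi}\,\exp(-a/2) \right]^{-1}. \tag{12}$$

The functional behaviour of R2 for correct models in
P1 and P1̄ is depicted in Fig. 3 and Fig. 4 respectively.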
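
The theoretical columns of Tables 2 and 3 can be reproduced with a few lines of code. The following Python sketch is a hedged illustration, not the authors' program. It assumes that, for a correct model, the conditional moments are ⟨E_n²; E_N⟩ = σ1²E_N² + σ2² and ⟨E_n⁴; E_N⟩ = σ1⁴E_N⁴ + kσ1²σ2²E_N² + cσ2⁴, with (k, c) = (4, 2) for P1 and (6, 3) for P1̄, and it then averages over the truncated Wilson distribution. The function names are hypothetical.

```python
import math


def truncated_wilson_moments(a, centric):
    """Return (<E_N^2>_a, <E_N^4>_a) for intensities E_N^2 >= a."""
    if not centric:
        return 1.0 + a, a * a + 2.0 * a + 2.0
    q = (math.sqrt(2.0 * a / math.pi) * math.exp(-a / 2.0)
         / math.erfc(math.sqrt(a / 2.0)))
    return 1.0 + q, 3.0 + (3.0 + a) * q


def r2_correct(sigma1_sq, a, centric=False):
    """R2(a) for a correct partial model of relative size sigma_1^2."""
    s = sigma1_sq
    m2, m4 = truncated_wilson_moments(a, centric)
    k, c = (6.0, 3.0) if centric else (4.0, 2.0)
    cross = s * m4 + (1.0 - s) * m2                              # <E_N^2 E_n^2>_a
    model_m4 = s * s * m4 + k * s * (1.0 - s) * m2 + c * (1.0 - s) ** 2  # <E_n^4>_a
    return (m4 - 2.0 * s * cross + s * s * model_m4) / m4


if __name__ == "__main__":
    # Non-centrosymmetric case, sigma_1^2 = 0.5 (compare Table 2, theory column).
    for a in (0.0, 0.2, 0.6, 2.0, 6.0):
        print(a, round(100.0 * r2_correct(0.5, a), 2))
    # Centrosymmetric case, 5 of 9 atoms known (compare Table 3, theory column).
    for a in (0.0, 1.0, 5.0):
        print(a, round(100.0 * r2_correct(5.0 / 9.0, a, centric=True), 2))
```

For the centrosymmetric test structure the tabulated sizes 0.11 and 0.56 are taken here as 1/9 and 5/9, following the table note that the model contains 1 out of 9 atoms.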

From  a  crystallographic  point  of  view  the  most 
interesting  series  of  models  are  the  correct  ones  ( n ,  0). 
The  correspondence  between  theory  and  experiment 
is illustrated in Tables 2 and 3. As an example of a
non-centrosymmetric structure we used ammonium
hydrogen l-malate (Versichel, Van de Mieroop &
Lenstra, 1978), space group P2₁2₁2₁, maximum
Bragg angle θ = 27° (Mo Kα). As a centrosymmetric
structure we used cis,cis-4,6-dimethyl trimethylenesulfite
(space group P2₁/c; Petit, Lenstra & Geise, 1978), in
which all actual atoms were replaced by, say, nitrogen
atoms, to obtain an equal-atom structure.

Fig. 3. R2 for the related case in space group P1.

Fig. 4. R2 for the related case in space group P1̄.

Table 2. Comparison of experimental R2 values with the
theoretical ones as a function of the threshold a
for a non-centrosymmetric structure.

σ1² measures the size of the known, correct structure fragment.
Here σ1² = 0.9 corresponds to a model with 9 out of 10 atoms
in the molecule. The asterisk indicates a local minimum in R2
as a function of a.

              σ1² = 0.9                     σ1² = 0.5
   a     100 × R2    100 × R2       100 × R2    100 × R2
          (exp)      (theor)         (exp)      (theor)
  0.00     9.44      10.00           54.10      50.00
  0.20     9.26       9.76           52.91      49.08
  0.40     9.11       9.44           52.80      48.65
  0.60     8.95       9.08           52.38*     48.53*
  0.80     8.94       8.73           52.62      48.58
  1.00     8.74       8.40           52.78      48.75
  1.20     8.43       8.09           53.48      48.97
  1.40     7.49       7.81           54.31      49.22
  1.60     7.45       7.56           54.43      49.48
  1.80     7.44       7.33           54.65      49.75
  2.00     7.30       7.12           54.93      50.00
  3.00     6.10       6.33           52.92      51.10
  4.00     5.66       5.82           52.00      51.92
  5.00     5.56       5.46           52.34      52.53
  6.00     5.81       5.20           55.00      53.00

The correspondence between theory and practice is
very satisfactory.

Table 3. Comparison of experimental R2 values with the
theoretical ones as a function of the threshold a
for a centrosymmetric structure.

σ1² measures the size of the known, correct structure fragment,
e.g. σ1² = 0.11 corresponds here to a model with 1 out of the
9 atoms in the molecule. NO represents the number of reflections
taken in the calculation of R2. θmax = 30° (Mo Kα).

                    σ1² = 0.11                    σ1² = 0.56
   a     NO    100 × R2    100 × R2       100 × R2    100 × R2
                (exp)      (theor)         (exp)      (theor)
  0.0   1561    93.92      92.18           50.86      52.68
  0.2    957    93.83*     91.98*          49.45      50.60
  0.4    753    93.87      92.07           49.30      49.87
  0.6    621    93.98      92.23           49.08      49.40
  0.8    532    94.20      92.40           48.83      49.05
  1.0    455    94.86      92.59           48.26      48.79
  1.2    387    95.00      92.78           48.16      48.59
  1.4    344    95.02      92.96           48.15      48.43
  1.6    308    95.65      93.13           47.86      48.30
  1.8    264    95.71      93.30           47.84      48.19
  2.0    240    95.83      93.45           47.78      48.10
  3.0    149    96.30      94.12           47.70      47.84
  4.0     86    96.61      94.61           47.68      47.72
  5.0     58    96.64      95.00           47.68      47.66


3. The general model (g, f)

Suppose we have calculated a tentative electron-density
distribution in which n maxima are found.
With respect to the observed N-atom structure this
tentative model contains g atoms at their correct
position, but f atoms (n − g = f) are totally misplaced.
The question is clear: ‘Are we able to calculate the
R2 value theoretically for a model (g, f)?’ The answer
to this question is positive. The procedure to predict
R2[a, (g, f)] is in fact quite simple. All we have to do
is to evaluate again the moments indicated in (4). As
an example we will discuss one simple term, notably
$\langle E_N^2 E_n^2 \rangle_a$.

The normalised intensity for our n-atom model is
given by:

$$E_n^2 = \frac{1}{n}\left[\left(\sum_{j=1}^{n} \cos 2\pi \mathbf{H}\cdot\mathbf{r}_j\right)^2 + \left(\sum_{j=1}^{n} \sin 2\pi \mathbf{H}\cdot\mathbf{r}_j\right)^2\right].$$

Handling normalised intensities, for any subset of atoms
(in our nomenclature N, n, g or f) one always has
$\langle E_N^2 \rangle = \langle E_n^2 \rangle = \langle E_g^2 \rangle = \langle E_f^2 \rangle = 1$. So describing
the structure as a system of equal point atoms the
scattering power of each atom is inversely proportional
to the square root of the number of atoms one uses
in the E calculation. Then the normalised intensity of
our n-atom model can be written as

$$E_n^2 = \frac{g}{n} E_g^2 + \frac{f}{n} E_f^2.$$
n  n  1 

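Since $E_g$ and $E_f$ are not interrelated, a simple averaging
gives

$$\langle E_N^2 E_n^2 \rangle = \frac{g}{n} \langle E_N^2 E_g^2 \rangle + \frac{f}{n} \langle E_N^2 \rangle \langle E_f^2 \rangle.$$

Introducing the threshold a on the observed intensities
does not influence $\langle E_f^2 \rangle$ at all. So

$$\langle E_N^2 E_n^2 \rangle_a = \frac{g}{n} \langle E_N^2 E_g^2 \rangle_a + \frac{f}{n} \langle E_N^2 \rangle_a. \tag{13}$$

Analogously we obtain:

$$\langle E_n^4 \rangle_a = \frac{g^2}{n^2} \langle E_g^4 \rangle_a + \alpha \frac{gf}{n^2} \langle E_g^2 \rangle_a + \frac{f^2}{n^2} \langle E_f^4 \rangle, \tag{14}$$

where α = 4 or 6 for the space group P1 and P1̄
respectively.

Substitution of these moments in (4) gives R2[a, (g, f)]
algebraically. The values of $\langle E_N^2 E_g^2 \rangle_a$ etc. are directly
available in Table 1 using $\sigma_1^2 = g/N$. To calculate
R2 we substitute these moments in (4), where due
to the size n of our model $\sigma_1^2$ remains n/N.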
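
The procedure for a general model can be sketched in code for the space group P1. This is an illustration under stated assumptions rather than the authors' implementation. The moments of the correct part are taken as ⟨E_g²⟩_a = t⟨E_N²⟩_a + (1 − t) and ⟨E_g⁴⟩_a = t²⟨E_N⁴⟩_a + 4t(1 − t)⟨E_N²⟩_a + 2(1 − t)², with t = g/N, the misplaced atoms are assumed to follow untruncated Wilson statistics (⟨E_f⁴⟩ = 2), and the function name is hypothetical.

```python
def r2_general_p1(g, f, n_total, a):
    """R2[a, (g, f)] in P1 for g correct and f misplaced atoms out of n_total."""
    n = g + f
    s1 = n / n_total           # sigma_1^2 of the complete n-atom model
    t = g / n_total            # sigma^2 of the correctly placed part
    m2 = 1.0 + a               # <E_N^2>_a
    m4 = a * a + 2.0 * a + 2.0  # <E_N^4>_a

    # Moments of the correct part, conditioned on E_N^2 >= a.
    eg2 = t * m2 + (1.0 - t)
    eg4 = t * t * m4 + 4.0 * t * (1.0 - t) * m2 + 2.0 * (1.0 - t) ** 2
    cross_g = t * (a * a + a + 1.0) + 1.0 + a   # <E_N^2 E_g^2>_a

    # Combine correct and misplaced parts of the model.
    cross = (g / n) * cross_g + (f / n) * m2
    en4 = (g / n) ** 2 * eg4 + 4.0 * (g * f / n ** 2) * eg2 + (f / n) ** 2 * 2.0

    return (m4 - 2.0 * s1 * cross + s1 * s1 * en4) / m4


if __name__ == "__main__":
    # Ten-atom structure: five atoms found, of which 0..5 are misplaced.
    for wrong in range(6):
        print(5 - wrong, wrong, round(r2_general_p1(5 - wrong, wrong, 10, 1.0), 4))
```

The two limiting cases serve as a consistency check: f = 0 reproduces the result for correct models and g = 0 the result for unrelated models.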

This description was verified using computer-simulated
experiments. Typical examples for P1 and
P1̄ are listed in Table 4.

The experimental ⟨R2⟩ values given are the
experimental averages of 200 R2-values generated for
200 different structures each containing 100 atoms in
the unit cell and described by 2000 reflections. Convergence
showed that R2 averaged over 200 structures
is stable up to the third digit.


Table 4. Comparison of ‘experimental’ R2-values against the theoretical ones as a function of the threshold a
for a non-centrosymmetric and centrosymmetric structure. ‘Standard deviations’ of the experimental
⟨R2⟩-values are shown in parentheses.
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4.  The  moments  of  R2 


To  decide  whether  a  tentative  model  has  to  be  accepted 
or  not  we  have  now  a  theoretical  R2  value  at  our 
disposal.  The  disadvantage  is  clear:  point  estimators 
are  of  limited  use.  A  proper  statistical  decision  can 
only be formulated if the distribution P(R2) is known.
Direct algebraic derivation of this function is not
possible. However, an indirect approach appeared
possible, viz. the enumeration of all moments of R2.
For brevity we will only deal with models of the type
(n, 0) and (0, n) in one single space group, namely P1,
and exclude effects due to the threshold.

In the previous sections the averages $\langle E^q \rangle_H$ were
replaced by intensity distribution averages. In the
derivation of the Wilson distribution functions one
takes the atomic coordinates as primitive variables.
This means that $\langle E^q \rangle_H$ is replaced by $\langle E^q \rangle_r$. For
infinite data sets these averages are indeed equal.
Unfortunately this equality becomes an approximation
when handling finite data sets. This is one reason why the
logic used in the previous sections cannot be used
beyond the level of a first central moment. A second
problem we have to cope with is that R2, etc. are
related to one single observed structure of size N
instead of to some sort of average N-atom structure.
This means that $P(E_N)$ is no longer representative for
our actual problem; this distribution has to be
replaced by $\{E_N(H)\}_K$ as a set of fixed quantities describing
our observed structure. For reasons of mathematical
simplicity we take the finite data set as a
random subset of the whole, infinite data set. As a
consequence equation (4) does not hold anymore.
It has to be replaced in terms of the above outlines by:

$$\langle R_2; \{E_N(H)\}_K \rangle = 1 + \sigma_1^4\, \frac{\sum_H \langle E_n^4; \{E_N(H)\}_K \rangle}{\sum_H E_N^4(H)} - 2\sigma_1^2\, \frac{\sum_H E_N^2 \langle E_n^2; \{E_N(H)\}_K \rangle}{\sum_H E_N^4(H)} \tag{16}$$

with $\{E_N\}_H \subset \{E_N\}_K$. To calculate the moments indicated
in (16) we have to know the conditional probability
function $P[E_n(H); \{E_N(H)\}_K]$. The formulation
of this conditional probability has to reflect the fact
that the proposed n-atom model is correct or
incorrect.


4.1. Incorrect structures of type (0, n)


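Since the correlation between every proposed n-atom
model and the observed structure is absent, we
have

$$\langle E_n^q(H); \{E_N(H)\}_K \rangle_{r_n} = \langle E_n^q(H) \rangle_{r_n}. \tag{17}$$

Using the asymptotic Wilson distribution $2E_n \exp[-E_n^2]$
the right-hand side of (17) can be enumerated as

$$\langle E_n^q(H) \rangle_{r_n} = \Gamma(q/2 + 1). \tag{18}$$

Substitution of these moments in (16) gives

$$\langle R_2 \rangle_{r_n} = 1 + \sigma_1^4\, \frac{2 \sum_H 1}{\sum_H E_N^4(H)} - 2\sigma_1^2\, \frac{\sum_H E_N^2(H)}{\sum_H E_N^4(H)}. \tag{19}$$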


This function describes R2 for a particular set of
structure factors of the observed N-atom structure.
It is also evident that the introduction of a threshold
a is not very difficult. Equation (6) has to be a limiting
case of (19). This can be easily shown. When we are
interested in an overall picture of R2, applicable to
any observed N-atom structure, we need to average
over all observable structures of N atoms. We then
find

$$\langle \langle R_2 \rangle_{r_n} \rangle_{r_N} = 1 + \sigma_1^4 - \sigma_1^2,$$

which is indeed equal to (6) with a threshold a = 0.

This simple experiment suffices to show that the
results of the previous sections are a special case of the
improved concept developed in this part.

The central moments $\mu_q$ of R2 are given by

$$\mu_q = \langle (R_2 - \langle R_2 \rangle)^q \rangle.$$

In terms of the geometrical moments the central
moments are algebraically expressed by

$$\mu_q = \sum_{j=0}^{q} \binom{q}{j} (-1)^j \langle R_2^{\,q-j} \rangle \langle R_2 \rangle^j.$$

After some manipulation we find

$$\mu_q = \sum_{i=0}^{q} \sum_{j=0}^{q-i} \sum_{k=0}^{q-i-j} \frac{q!\,(-1)^{i+k}\, 2^{i+j+k}\, \sigma_1^{2(2q-j-k)}}{i!\, j!\, k!\, (q-i-j-k)!}\; \frac{\sum_H E_N^{2(j+k)} \langle E_n^{2(2q-2i-2j-k)} \rangle}{\left(\sum_H E_N^4\right)^q}.$$

From (18) we know

$$\langle E_n^{2(2q-2i-2j-k)} \rangle = (2q - 2i - 2j - k)!$$

Let q = 2, i.e. we calculate $\sigma^2(R_2)$. The resulting
expression reads

$$\langle \sigma^2(R_2) \rangle_{r_n} = \left( 4\sigma_1^4 \sum_H E_N^4 - 16\sigma_1^6 \sum_H E_N^2 + 20\sigma_1^8 \sum_H 1 \right) \Big/ \left( \sum_H E_N^4 \right)^2,$$

where $1 = E_N^0$. We see that $\sigma^2(R_2)$ is inversely proportional
to the number of observations, which agrees
with one’s intuitive expectation. Of more
direct use is the fact that for practical purposes
P(R2) can be regarded as a Gaussian distribution.
In Table 5 the values of R2, σ(R2) and the skewness
(γ1 = μ3/σ³) are listed for a structure of 500 atoms with
10 000 reflections. The numbers are calculated using
the additional averaging over $r_N$ in order to obtain
an overall picture rather than a specific one. The
kurtosis [(μ4/σ⁴) − 3] is ca. −2.99, which means that
P(R2) is platykurtic.
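
The mean and the variance of R2 for an unrelated model can be evaluated directly from a list of observed normalised intensities. The Python sketch below is an illustration only. It implements the expressions for ⟨R2⟩ and σ²(R2) of this section for P1, taking a plain list of E_N² values as input; the function names, and the use of exponentially distributed test intensities in the example, are assumptions made for the demonstration.

```python
import math
import random


def r2_mean_incorrect(e_sq, sigma1_sq):
    """<R2> for an unrelated n-atom model, given the observed E_N^2 values."""
    s4 = sum(z * z for z in e_sq)
    s2 = sum(e_sq)
    count = len(e_sq)
    return 1.0 + sigma1_sq ** 2 * 2.0 * count / s4 - 2.0 * sigma1_sq * s2 / s4


def r2_variance_incorrect(e_sq, sigma1_sq):
    """sigma^2(R2) for an unrelated n-atom model (q = 2 central moment)."""
    s = sigma1_sq
    s4 = sum(z * z for z in e_sq)
    s2 = sum(e_sq)
    count = len(e_sq)
    return (4.0 * s ** 2 * s4 - 16.0 * s ** 3 * s2 + 20.0 * s ** 4 * count) / s4 ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    for n_refl in (1000, 10000):
        # Acentric Wilson statistics: E^2 is exponentially distributed, mean 1.
        data = [rng.expovariate(1.0) for _ in range(n_refl)]
        mean = r2_mean_incorrect(data, 0.5)
        sigma = math.sqrt(r2_variance_incorrect(data, 0.5))
        print(n_refl, round(mean, 4), round(sigma, 4))
```

Increasing the number of reflections tenfold should reduce σ(R2) by roughly the square root of ten, in line with the inverse proportionality noted above.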

4.2.  Correct  models  of  type  (n,  0) 

To calculate R2 using (16) we need to know the
intensity moments $\langle E_n^q(H); \{E_N(H)\}_K \rangle$. These
moments are, however, too complicated to obtain
without an additional assumption. Let us ignore the
correlation, if any, between the intensities of the
reflections H and K. Then the required intensity
moments can be simplified to $\langle E_n^q(H); E_N(H) \rangle$.
As far as practical results are decisive in underlining
a theoretical concept, the final tests show that this
approximation holds in its consequences.

To calculate $\langle E_n^q; E_N \rangle$ we need to construct the
conditional probability function $P(E_n; E_N)$. Fortunately
for correct models the conditional probability
function $P(E_N; E_n)$ has been derived (Srinivasan &
Parthasarathy, 1976). This function is given by

$$P(E_N; E_n) = \frac{2E_N}{\sigma_2^2} \exp\left[-\frac{E_N^2 + \sigma_1^2 E_n^2}{\sigma_2^2}\right] I_0\left(\frac{2\sigma_1 E_N E_n}{\sigma_2^2}\right),$$

with $\sigma_1^2 = \sigma_n^2/\sigma_N^2$, $\sigma_N^2 = \sum_{j=1}^{N} f_j^2$ and $\sigma_n^2 = \sum_{j=1}^{n} f_j^2$.

To get the necessary distribution $P(E_n; E_N)$ we apply
the theorem of Bayes

$$P(E_n; E_N) = \frac{P(E_N; E_n)\, P(E_n)}{P(E_N)}.$$

Substitution of the relevant distributions at the right-hand
side gives

$$P(E_n; E_N) = \frac{2E_n}{\sigma_2^2} \exp\left[-\frac{E_n^2 + \sigma_1^2 E_N^2}{\sigma_2^2}\right] I_0\left(\frac{2\sigma_1 E_N E_n}{\sigma_2^2}\right).$$

The moments of this distribution are given by

$$\langle E_n^q; E_N \rangle = \Gamma(q/2 + 1)\, \sigma_2^q\; {}_1F_1(-q/2;\, 1;\, -\sigma_1^2 E_N^2/\sigma_2^2), \tag{20}$$

where ${}_1F_1(x; y; z) = 1 + \frac{x}{y} z + \frac{x(x+1)}{y(y+1)} \frac{z^2}{2!} + \dots$ is
a so-called hypergeometric function.

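Substitution of the relevant moments in (16) gives

$$\langle R_2 \rangle_{r_n} = \left\{ (1 - \sigma_1^4)^2 \sum_H E_N^4 + 2\sigma_1^2 (1 - 2\sigma_1^4)(\sigma_1^2 - 1) \sum_H E_N^2 + 2\sigma_1^4 (1 - \sigma_1^2)^2 \sum_H 1 \right\} \Big/ \sum_H E_N^4.$$

For correct models we calculated the qth central
moment too. After some manipulations we find

$$\mu_q = \sum_{i=0}^{q} \sum_{j=0}^{q-i} \sum_{k=0}^{q-i-j} \sum_{l=0}^{q-i-j-k} \frac{q!\,(-1)^{i+l}\, 2^{i+j+l}\, \sigma_1^{2(2q-j-l)}}{i!\, j!\, k!\, l!\, (q-i-j-k-l)!}\, (1 - \sigma_1^2)^{j+2i}\, (2 - \sigma_1^4)^k\, (1 - 2\sigma_1^4)^j\; \frac{\sum_H E_N^{2(2k+j+l)} \langle E_n^{2(2q-2i-2j-2k-l)}; E_N \rangle}{\left( \sum_H E_N^4 \right)^q}.$$

Since only the even moments of $E_n$ are required, the
confluent hypergeometric function [see (20)] can be
replaced by a finite polynomial. Therefore we have

$$\langle E_n^{2m}; E_N \rangle = \sum_{p=0}^{m} \frac{m!\, m!}{p!\, p!\, (m-p)!}\, \sigma_1^{2p} E_N^{2p}\, (1 - \sigma_1^2)^{m-p}.$$

Taking q = 2 we find $\sigma^2(R_2)$:

$$\sigma^2(R_2) = \Big\{ (1 - \sigma_1^2)(8\sigma_1^{14} - 16\sigma_1^{10} + 8\sigma_1^6) \sum_H E_N^6 + (1 - \sigma_1^2)^2 (52\sigma_1^{12} - 48\sigma_1^8 + 4\sigma_1^4) \sum_H E_N^4 + (1 - \sigma_1^2)^3 (80\sigma_1^{10} - 16\sigma_1^6) \sum_H E_N^2 + 20\sigma_1^8 (1 - \sigma_1^2)^4 \sum_H 1 \Big\} \Big/ \left( \sum_H E_N^4 \right)^2.$$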
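
The expressions for correct models can be checked against each other numerically. The following Python sketch is a hedged illustration, not the authors' code. It evaluates the finite polynomial for the even conditional moments, the mean ⟨R2⟩ and the closed form of σ²(R2), and rebuilds the variance reflection by reflection from the polynomial moments as an independent check. The function names are hypothetical, and the independence of different reflections is assumed as in the text.

```python
import math


def even_moment(m, e_big_sq, sigma1_sq):
    """<E_n^(2m); E_N> written as a finite polynomial in E_N^2."""
    s2 = 1.0 - sigma1_sq
    return sum(
        math.factorial(m) ** 2
        / (math.factorial(p) ** 2 * math.factorial(m - p))
        * (sigma1_sq * e_big_sq) ** p * s2 ** (m - p)
        for p in range(m + 1)
    )


def r2_mean_correct(e_sq, s):
    """<R2> for a correct model of relative size s = sigma_1^2."""
    s4 = sum(z * z for z in e_sq)
    s2 = sum(e_sq)
    count = len(e_sq)
    num = ((1.0 - s * s) ** 2 * s4
           + 2.0 * s * (1.0 - 2.0 * s * s) * (s - 1.0) * s2
           + 2.0 * s * s * (1.0 - s) ** 2 * count)
    return num / s4


def r2_variance_correct(e_sq, s):
    """Closed form of sigma^2(R2) for a correct model (q = 2)."""
    s6 = sum(z ** 3 for z in e_sq)
    s4 = sum(z * z for z in e_sq)
    s2 = sum(e_sq)
    count = len(e_sq)
    num = ((1.0 - s) * (8.0 * s ** 7 - 16.0 * s ** 5 + 8.0 * s ** 3) * s6
           + (1.0 - s) ** 2 * (52.0 * s ** 6 - 48.0 * s ** 4 + 4.0 * s ** 2) * s4
           + (1.0 - s) ** 3 * (80.0 * s ** 5 - 16.0 * s ** 3) * s2
           + 20.0 * s ** 4 * (1.0 - s) ** 4 * count)
    return num / s4 ** 2


def r2_variance_from_moments(e_sq, s):
    """Same variance, accumulated per reflection from even_moment()."""
    s4 = sum(z * z for z in e_sq)
    total = 0.0
    for z in e_sq:
        m1, m2, m3, m4 = (even_moment(m, z, s) for m in (1, 2, 3, 4))
        total += (s ** 4 * (m4 - m2 * m2)            # variance of sigma_1^4 E_n^4
                  + 4.0 * s * s * z * z * (m2 - m1 * m1)  # variance of 2 sigma_1^2 E_N^2 E_n^2
                  - 4.0 * s ** 3 * z * (m3 - m2 * m1))    # covariance of the two terms
    return total / s4 ** 2


if __name__ == "__main__":
    data = [0.2, 0.7, 1.0, 1.8, 3.1]
    print(r2_mean_correct(data, 0.5),
          r2_variance_correct(data, 0.5),
          r2_variance_from_moments(data, 0.5))
```

For a complete model (σ1² = 1) both the mean and the variance vanish, as they should.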

In Table 6 R2, σ(R2) and the skewness are given for
an ‘overall’ structure of 500 atoms characterised with
10 000 reflections. The value of the kurtosis (μ4/σ⁴ − 3)
is again ca. −2.99. Once more we can conclude that
P(R2) for correct structure models can be regarded
for practical purposes as a Gaussian distribution.

Using the concept of a Gaussian function the first
two moments ⟨R2⟩ and σ(R2) are sufficient to
describe P(R2). We tested this in a practical situation,
taking N-acetyl-allohydroxy-L-proline lactone (Lenstra,
Petit & Geise, 1979) as an example. The asymmetric
part of the unit cell was taken as a P1 structure
in which the original C7H9NO3 moiety was replaced
by $X_{11}$, where X has a scattering power of $1/\sqrt{11}$.
Each model of n atoms was calculated in all possible
$\binom{11}{n}$ ways. This gave for every model of n atoms
three final R-values, viz. its minimum, its maximum
and its averaged values. The agreement between
theory and practice is illustrated in Figs. 5, 6 & 7.

Inspection of these functions reveals an astonishing
property of σ(R2). Its numerical value for models of
the type (n, 0) hardly changes when ~80% of the data,
notably the low intensities, are omitted from the
calculations. This surprising feature makes it look
profitable to use only the large intensities to test the
reliability of a tentative structure. To prevent over-optimistic
applications, one has to look at the consequences
of this behaviour. This is, at least at one
dominant point, worked out in more detail by Petit &
Lenstra (1982).

Fig. 5. 0 ≤ E². Experimental points are given by circles, solid
lines represent theoretical values.

Fig. 6. Data set 1 ≤ E². Experimental points are given by
circles, solid lines represent theoretical values.

Fig. 7. Data set 2 ≤ E². Experimental points are given by
circles, solid lines represent theoretical values.

R2 and its behaviour for correct models is sometimes
not at all simple. One expects that R2 will have values
between 1 and 0. This is not true! Let us take those
reflections which in direct methods are used to
calculate psi-zero. Using the same structure that
served to calculate the experimental points in
Figs. 5, 6 & 7 we obtain Fig. 8 using
error-free $E_N$-values. Again theory and experiment
show the same features, though now qualitatively and
not quantitatively. At present this is a matter of further
investigation. The main lesson at this point is that it
raises the question ‘Is psi-zero truly a nice
indicator function?’. Looking at Fig. 8 our present
guess is: ‘It is not!’. If it is, please explain why.

5.  A  first  application  of  the  theory:  rotation  search 

The purpose of a rotation search is to locate a known
molecular fragment in an orientation [C] in a unit
cell coinciding with the one of the observed N-atom
structure. The rotation search is always performed in
the triclinic space groups P1 and P1̄. This means that
rotation search is a fine example of the situations
(0, n) and (n, 0).

In reciprocal space the measure of fit between
model and observed structure is given by:

$$R(C) = \sigma_1^2 \sum_H E_N^2 E_n^2(C) \tag{21}$$

which is simply the double product term existing in the
definition of R2.


Fig. 8. Data set 0 < E < 0.5. Experimental points are given
by circles, solid lines represent theoretical values.

In Fig. 9 two histograms are shown, in which
ammonium hydrogen malate (Versichel, Van de
Mieroop & Lenstra, 1978) served as test compound.
Both histograms are practically the same in spite
of the difference in the sampling technique. R_max
corresponds to the correct orientation. All other
R-values correspond to incorrect orientations. Looking
at these histograms we see that P(R) is not
Gaussian. There is definitely some tailing in the
direction of R_max. This tailing effect can be
easily explained in terms of partially correctly oriented
models. By this we mean that the chemically known
fragment is well oriented with the exception of some of
its substituents. Using the terminology of §3, we have

$$R(g, f)_a = NO \left\{ \frac{g}{n} [\sigma_g^2 (a^2 + a + 1) + 1 + a] + \frac{f}{n} (1 + a) \right\} e^{-a} \sigma_1^2 \tag{22}$$

for the space group P1, and

$$R(g, f)_a = NO \left\{ \frac{g}{n} [1 + 2\sigma_g^2 + (1 + (2 + a)\sigma_g^2)\, Q] + \frac{f}{n} (1 + Q) \right\} \mathrm{erfc}\sqrt{a/2}\; \sigma_1^2, \tag{23}$$

for the space group P1̄, where $\sigma_g^2 = g/N$ and Q as in
Table 1 and NO is the effective number of observations.

Fig. 9. Histograms of experimental R-values in a rotation
search of ammonium hydrogen (l)-malate with a 10-atom
search fragment using the Eulerian angle sampling technique
(left) and the optimal sampling technique of Lattman (1972).
To make comparison easier, the number of sampling points
in both techniques has been scaled to the same value.

The proper formulae for R(n, 0) and R(0, n) are easy
to obtain. For the space group P1 we have:

$$R(0, n)_a = NO\,(1 + a)\, e^{-a} \sigma_1^2,$$

$$R(n, 0)_a = NO\,[1 + \sigma_1^2 + (1 + \sigma_1^2)\, a + \sigma_1^2 a^2]\, e^{-a} \sigma_1^2.$$
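
The contrast between the correct and an incorrect orientation follows directly from these two formulae. The Python sketch below is an illustration only, with hypothetical function names. It evaluates R(n, 0)_a, R(0, n)_a, their ratio, and the expected number NO·exp(−a) of reflections above the threshold, using NO = 3677 and σ1² = 0.234 as quoted for Table 8.

```python
import math


def r_incorrect_p1(n_obs, sigma1_sq, a):
    """R(0, n)_a: expected fit for an incorrectly oriented fragment in P1."""
    return n_obs * (1.0 + a) * math.exp(-a) * sigma1_sq


def r_correct_p1(n_obs, sigma1_sq, a):
    """R(n, 0)_a: expected fit for the correctly oriented fragment in P1."""
    s = sigma1_sq
    return n_obs * (1.0 + s + (1.0 + s) * a + s * a * a) * math.exp(-a) * s


def contrast_p1(sigma1_sq, a):
    """Ratio R(n, 0)_a / R(0, n)_a; independent of the number of reflections."""
    return r_correct_p1(1.0, sigma1_sq, a) / r_incorrect_p1(1.0, sigma1_sq, a)


def expected_count(n_obs, a):
    """Number of reflections with E_N^2 >= a under acentric Wilson statistics."""
    return n_obs * math.exp(-a)


if __name__ == "__main__":
    for a in (0.0, 1.0, 2.0, 5.0):
        print(a, round(expected_count(3677, a)), round(contrast_p1(0.234, a), 2))
```

The ratio grows with the threshold, while the number of contributing reflections falls exponentially, so the choice of a is a trade-off between contrast and statistical noise.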


A numerical verification is summarised in Table 8,
where R(n, 0)/R(0, n) is given as well as the experimental
ratio R_max/R_ave, in which R_ave is the
experimental average of R over all data points. Let us
concentrate on incorrectly oriented models. Following
the description in §4

$$\sigma^2(R) = \sigma_1^4 \sum_H E_N^4 \quad \text{for} \quad R = \sigma_1^2 \sum_H E_N^2 E_n^2. \tag{24}$$


Consequently σ(R) for a particular structure to be
determined can be calculated prior to an actual
rotation search. When P(R) can be taken as Gaussian,
this means that Z = (R_ave − R_min) will be roughly 3 times
σ(R) for the ca. 1400 sampling points depicted in
Fig. 9. In Table 7 the numerical value of Z is given as
a function of the threshold a. Using (24) the ratio
Z/σ is also shown with the threshold as parameter.
With our present knowledge we are able to formulate
the rotation search in such terms that the correct
orientation can be found at minimal computer costs
due to an ‘error-analysis’ prior to the actual calculations.

Table 7. Experimental approach to the R-distribution for
ammonium hydrogen (l)-malate with a 10-atom
search fragment. Z = R_ave − R_min.

   a       Z       Z/σ            Z/σ
                   [σ of (24)]    [σ of (25)]
  0.0     70.6     2.60           3.29
  1.0     74.7     2.76           3.64
  2.0     61.3     2.48           3.48
  3.0     66.4     2.87           4.76
  4.0     55.0     2.61           5.26
  5.0     46.2     2.45           6.10
  6.0     38.1     2.37           7.15

The  charm  of  this  example  is  also  that  it  allows  us  to 
illustrate  explicitly  the  consequences  of  gaining  an 
overall picture. To get an overall picture of σ(R) we
replace $\sum_H E_N^4$ in (24) by its statistical value. We then
replace

$$\langle \sigma^2(R) \rangle_{r_n} \quad \text{by} \quad \langle \langle \sigma^2(R) \rangle_{r_n} \rangle_{r_N}.$$

(24) then becomes

$$\sigma^2(R) = \sigma_1^4\, NO\, e^{-a} (a^2 + 2a + 2) \quad \text{for } P1 \tag{25}$$

and

$$\sigma^2(R) = \sigma_1^4\, NO\, \mathrm{erfc}\sqrt{a/2}\; [3 + (3 + a)\, Q] \quad \text{for } P\bar{1}.$$

Table 8. Comparison of theory and experiment in the
non-centrosymmetric case as a function of the
threshold, with a search fragment of size σ1² = 0.234.

              NO                 R(n, 0)/R(0, n)
   a     actual   theory         exp.     theory
  0.0     3677     3677          1.28      1.23
  0.2     2770     3010          1.29      1.24
  0.4     2153     2465          1.31      1.26
  0.6     1757     2018          1.33      1.29
  0.8     1454     1652          1.35      1.32
  1.0     1188     1353          1.38      1.35
  1.2      976     1107          1.42      1.39
  1.4      795      907          1.46      1.43
  1.6      672      742          1.48      1.46
  1.8      560      608          1.50      1.50
  2.0      454      498          1.53      1.55
  3.0      278      183          1.61      1.76
  4.0      159       67          1.71      1.98
  5.0       97       25          1.82      2.21

Taking σ from (25) we also calculated Z/σ. For qualitative
purposes the values are useful, because they are
of the same magnitude as the ones tailored to our
example. For applications, it is clear that (24) is of
much more value than (25). The large discrepancies
in ammonium hydrogen l-malate between σ(24) and
σ(25) are due to the fact that $P(E_N)$ has a centric
character. This is shown in Table 8, where $NO\, e^{-a}$ is
compared with the actual number of $E_N^2$-values
above a threshold a.
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Alternatives  to  Least  Squares 

By  A.  J.  C.  Wilson 

Department  of  Physics,  University  of  Birmingham, 
Birmingham B15 2TT, England


Editorial  Note 

Least-squares  adjustment  is  undoubtedly  the  com¬ 
monest  method  of  estimating  parameters,  but  it  is  not 
the  only  one.  In  crystallography  Fourier  methods 
have  been  much  used,  and  in  statistics  in  general  the 
method  of  maximum  likelihood  is  probably  the  main 
rival  of  least  squares.  It  was  hoped  to  have  a  review 
paper  on  ‘alternatives  to  least  squares’  for  this  sym¬ 
posium.  Unfortunately  none  of  those  invited  was  able 
to  accept,  though  there  were  contributed  papers 
related  to  the  topic;  two  of  them  are  reproduced  below 
(pp.  229,  269).  It  may  be  useful  to  try  to  put  some 
of  the  problems  in  perspective. 

Naive  least-squares  gives  unit  weight  to  observa¬ 
tions.  Least-squares  adjustment  of  the  observed  and 
calculated  intensities  is  then  exactly  equivalent  to 
obtaining  the  best  fit  between  the  Patterson  density, 
as  represented  by  the  observed  intensities,  and  the 
Patterson  density  calculated  from  the  model  struc¬ 
ture.  Statistical  fluctuations  in  the  observations  do  not 
bias  the  structural  parameters.  With  certain  reserva¬ 
tions,  it  may  be  said  that  least-squares  refinement 
based  on  structure  factors  is  equivalent  to  making  a 
least-squares  fit  between  the  observed  and  the  cal¬ 
culated  electron  densities,  but  the  parameters  obtained 
would  be  subject  to  some  bias.  Orthodox  least-squares 
uses  weights  inversely  proportional  to  the  estimated 
standard  deviations,  and  if,  as  is  usual,  the  estimates 
depend  on  actual  observations,  the  resulting  para¬ 
meters  are  subject  to  bias,  probably  not  quite  negli¬ 
gible  for  scale  and  thermal  parameters  in  work  of 
ordinary  accuracy,  and  not  quite  negligible  for 
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structural  parameters  in  work  of  the  highest  accuracy. 
Such  refinements  are  equivalent  to  obtaining  a  fit 
between  the  ‘observed’  Patterson  or  electron  densities 
distorted  by  the  statistical  weights,  and  similarly 
distorted  calculated  densities.  These  matters  have 
been  discussed  in  greater  detail  elsewhere  (Wilson 
1976a,  b,  1978).  Various  non-orthodox  weighting 
schemes  have  been  proposed  or  used;  besides  the 
‘robust/resistant’  and  others  described  below,  refer¬ 
ence  may  be  made  to  papers  by  Nielsen  (1977),  Davis, 
Maslen  &  Varghese  (1978)  and  Rees  (1978).  Bias  has 
not  been  investigated. 

Maximum-likelihood  methods  depend  on  a  know¬ 
ledge  of,  or  an  assumption  about,  the  distribution 
function  of  the  statistical  fluctuations  and  other 
random  errors  in  the  observations,  whereas  least- 
squares  methods  depend  on  little  more  than  a  finite 
variance  for  these  errors.  If  the  distribution  function 
were  known,  maximum  likelihood  would  give  the 
better  estimate,  as  it  incorporates  more  information. 
In  general  the  distribution  function  is  not  known ;  all 
that  can  be  said  with  reasonable  certainty  is  that  for 
intensity  measurements  the  distribution  is  only 
approximately  Gaussian,  and  that  large  fluctuations 
are  more  frequent  than  would  be  expected  for  a 
Gaussian  distribution  (for  the  theory  see  Wilson, 
1980;  for  empirical  evidence  see  de  Boer,  this  volume, 
p.  179-186).  At  least  two  crystallographic  applica¬ 
tions  of  maximum  likelihood  have  been  made: 
Beu  (Beu,  Musil  &  Whitney,  1962,  and  several  later 
papers  with  various  collaborators)  used  it  for  lattice- 
parameter  determination,  and  Price  (1979)  has 
proposed  to  use  it  for  structural  parameters.  For  a 
Gaussian  distribution,  as  assumed  by  Beu,  Musil  & 
Whitney,  maximum  likelihood  is  practically  equi¬ 
valent  to  least  squares  (Hamilton,  1964,  Bard,  1974). 
The  case  for  likelihood  methods  has  been  persuasively 
argued  by  Edwards  (1972),  and  the  crystallographic 
applications  have  been  discussed  by  Mandel  (1980) 
and  Wilson  (1980). 
Abstract

A  refinement  technique  is  ‘robust’  if  it  works  well  over 
a  broad  class  of  error  distributions  in  the  data,  and 
‘resistant’  if  it  is  not  strongly  influenced  by  any  small 
subset  of  the  data.  Least  squares  possesses  neither 
property. A more robust/resistant procedure is to
minimize, instead of a simple sum of squared differences,
a sum of terms of the form
$(x^2/2)\,[1 - (x/a)^2 + \tfrac{1}{3}(x/a)^4]$ for $|x| < a$ and $a^2/6$ for $|x| \ge a$. Here
$x = w^{1/2}(|F_o| - |F_c|)/s$, $s$ is a measure of the width
of the error distribution based on the results of the
previous cycle, and $a$ is a constant chosen so that
extreme data do not influence the solution. This
function behaves like the sum of squares for small
$|x|$, but is constant for large $|x|$, so that the effect of
large differences is deemphasized. A least-squares
program can easily be modified to perform this more
robust/resistant procedure. The modified procedure
has been used in a reanalysis of the D(+)-tartaric
acid data collected by the Single Crystal Intensity
Project of the International Union of Crystallography
[Abrahams, Hamilton & Mathieson (1970), Acta
Cryst. A26, 1-17]. The results show that the technique
provides  an  efficient  means  for  automatic  screening 
of  a  data  set  for  discrepant  data  points.  It  gives  results 
in  agreement  with  the  least-squares  results  for  good 
data  sets.  If  the  results  do  not  agree  with  least  squares 
it  suggests  systematic  effects.  A  detailed  analysis  of 
residuals  may  identify  the  problem  and  help  to 
determine  whether  the  robust/resistant  refinement  is 
an  improvement. 


Introduction 

A  technique  for  fitting  a  theoretical  model  to  a  set  of 
experimental  data  points  and  estimating  the  best 
values  of  adjustable  parameters  in  the  model  is  said 
to  be  ‘robust’,  or,  more  precisely,  ‘robust  of  efficiency’, 
if  the  parameter  estimates  have  near  minimum  vari¬ 
ance  for  a  broad  class  of  distributions  for  the  errors 
in  experimental  data.  A  technique  is  ‘resistant’  if  the 
estimates  are  not  highly  dependent  on  any  small 
subset  of  the  experimental  data.  To  date,  all  techniques 
that  are  robust  of  efficiency  are  also  resistant.  Some 
data  analysts  feel  that  this  is  inevitable,  hence  the 
splice  word  robust/resistant. 

The  technique  of  least  squares,  the  one  most  com¬ 
monly  used  for  refining  crystal  structures,  is  neither 
robust  nor  resistant.  Least  squares  was  designed 
specifically  by  Gauss  for  an  idealized  distribution  of 
errors — an  error  distribution  now  referred  to  as 
Gaussian.  Typically,  the  errors  in  experimental 
crystallographic  structure  factors  are  not  Gaussian. 
Inadequacy  of  the  structure  factor  model  induces 
correlation  and  bias  in  residuals,  which  may  propagate 
as  bias  in  parameter  estimates.  True  precision  of 
experimental  data  is  not  known.  The  net  result  is  that 
error  distributions  are  usually  much  longer  tailed  than 
Gaussian,  and  error  subsets  may  be  highly  correlated. 

In recent years, there have been a number of
studies  of  variations  of  the  traditional  methods  of 
fitting  models  to  data,  to  develop  techniques  that  are 
more  robust/resistant  than  least  squares.  Tukey 
(1974)  has  described  the  properties  such  a  method 
should  have.  The  weakness  of  least  squares  lies  in  its 
great  sensitivity  to  the  occurrence  of  data  points 
widely  deviating  from  their  population  means  with  a 
frequency  greatly  exceeding  that  predicted  by  a 
Gaussian  distribution.  A  robust/resistant  technique 
treats  the  body  of  data  in  a  manner  similar  to  least 
squares.  Wild  data  are  ignored,  with  a  smooth  transi¬ 
tion  of  treatment  for  intermediate  situations  between 
these  extremes. 

In  this  article,  we  present  a  study  of  the  application 
of  robust/resistant  techniques  to  crystal  structure 
refinement, using the data taken on crystals of D(+)-tartaric
acid for the Single Crystal Intensity Measurement
Project of the International Union of Crystallography
(Abrahams, Hamilton & Mathieson, 1970).
Our  purpose  for  this  study  is  two-fold:  first,  to  use 
crystal-structure  refinement  as  a  nontrivial  test  of  the 
robust/resistant  approach;  and  second,  to  improve 
crystallographic  refinement  techniques. 


Robust/resistant  refinement 


Let $y_i = |F_{\mathrm{obs}}(\mathbf{h}_i)|$ ($i = 1, 2, \ldots, N$) be a set of $N$
experimentally determined structure amplitudes,
where $\mathbf{h}_i$ is a reciprocal lattice vector. We consider
the problem of fitting the usual model

$$m_i(\boldsymbol{\theta}) = |F_{\mathrm{calc}}(\mathbf{h}_i, \boldsymbol{\theta})|,$$

where

$$F_{\mathrm{calc}}(\mathbf{h}_i, \boldsymbol{\theta}) = Q\,E_i \sum_j f_j \exp\left(2\pi i\,\mathbf{h}_i^{T}\mathbf{r}_j - \mathbf{h}_i^{T}\boldsymbol{\beta}_j\mathbf{h}_i\right). \qquad (1)$$

The sum is taken over the set of atoms in the asymmetric
unit, $Q$ is a scale factor, $E_i$ an extinction factor
and $i = \sqrt{-1}$ in the exponent. For the $j$-th atom, $f_j$ is the atomic
scattering factor, $\mathbf{r}_j$ is the position vector, and $\boldsymbol{\beta}_j$ is
the thermal vibration tensor. $\boldsymbol{\theta}$ is a $p$-dimensional
vector of unknown parameters, including scale factor,
extinction parameter, atom position coordinates, and
atom thermal vibration parameters. More complex
models with multiple scaling and extinction parameters,
etc., are easy extensions and can be handled
in a fashion similar to the discussion here.

A conventional ‘full matrix’ weighted least-squares
refinement fits structure factors to the model by selecting
$\boldsymbol{\theta}$ to minimize

$$\sum_{i=1}^{N} r_i^2(\boldsymbol{\theta}), \qquad (2)$$

where $r_i(\boldsymbol{\theta}) = \sqrt{w_i}\,[y_i - m_i(\boldsymbol{\theta})]$ is the $i$th standardized
residual. Here, $w_i$ is a weighting factor which reflects
the precision of the measured structure factor amplitude
$y_i$. Statistical theory suggests that $w_i = 1/\sigma_i^2$,
where $\sigma_i^2$ is the variance of $y_i$. In practice, $\sigma_i^2$ is unknown
and must be estimated. A common estimate
among crystallographers is

$$w_i = 1/[\sigma_{ci}^2 + (b\,y_i)^2], \qquad (3)$$

where $\sigma_{ci}^2$ is an estimate of the variance due to counting
statistics and $b$ is a constant chosen to reflect the
variability of symmetry equivalent strong reflections.
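Crystallographers commonly use ‘weighted R’,
wR, to measure the goodness of fit of a refinement
solution $\hat{\boldsymbol{\theta}}$. Specifically,

$$wR = \left[\sum_{i=1}^{N} r_i^2(\hat{\boldsymbol{\theta}}) \Big/ \sum_{i=1}^{N} w_i\,y_i^2\right]^{1/2}. \qquad (4)$$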
Weighted R has the same numerator as the residual
standard deviation estimate of scale,

$$SD = \left[\sum_{i=1}^{N} r_i^2(\hat{\boldsymbol{\theta}}) \Big/ (N - p)\right]^{1/2}. \qquad (5)$$

Suppose that the weight used in the refinement is
obtained by neglecting counting statistics in (3),
giving a ‘constant relative variance weight’. Suppose
further that this weight correctly assays precision
except, possibly, that the relative variance factor $b^2$
may be wrong. Let $b_0^2$ be the correct factor. Then for
large $N$, wR estimates $\sqrt{1 - p/N}\;b_0$ while SD estimates
$b_0/b$. Thus, for situations where relative variance
error in (3) is dominant and $b$ is at least approximately
correct, wR estimates relative precision and SD
estimates unity.
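
The behaviour of the two agreement measures under a constant relative variance weight can be illustrated with a short simulation. This is only a sketch under stated assumptions: the amplitudes, the values of `b0` and `b`, and the function name `weighted_r_and_sd` are hypothetical, and the model values are taken as known exactly (so $p = 0$ and the factor $\sqrt{1 - p/N}$ is unity).

```python
import numpy as np

def weighted_r_and_sd(y, m, w, p):
    """Return (wR, SD) following equations (4) and (5)."""
    r = np.sqrt(w) * (y - m)              # standardized residuals r_i
    ss = np.sum(r**2)                     # numerator shared by wR and SD
    wR = np.sqrt(ss / np.sum(w * y**2))
    sd = np.sqrt(ss / (len(y) - p))
    return wR, sd

# Hypothetical simulation: true relative error b0, assumed relative error b.
rng = np.random.default_rng(0)
N, p = 20000, 0                           # model values known exactly, so p = 0
b0, b = 0.05, 0.10
m = rng.uniform(5.0, 100.0, N)            # "true" amplitudes (illustrative)
y = m * (1.0 + b0 * rng.standard_normal(N))
w = 1.0 / (b * y)**2                      # constant relative variance weight
wR, SD = weighted_r_and_sd(y, m, w, p)
print(f"wR = {wR:.3f} (b0 = {b0}),  SD = {SD:.3f} (b0/b = {b0 / b})")
```

With the weight deliberately mis-scaled by a factor of two, wR should still land near the true relative precision, while SD should land near the ratio of true to assumed relative error rather than near unity.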

In  a  full-matrix  refinement,  multiple  equivalent 
reflections  and  duplicate  reflections  are  often  handled 
by  first  averaging  to  get  a  single  value  and  then 
modifying  the  weight  to  reflect  the  precision  of  a 
weighted  average  where  individual  weights  reflect 
the  relative  precision  of  the  individuals  making  up 
that  average.  For  simplicity,  in  the  remainder  of  the 
discussion,  the  term  ‘equivalent’  is  used  both  for  a 
true  equivalent  reflection,  where  crystal  symmetry 
gives  the  same  model  for  several  distinct  incident 
beam  directions,  and  for  duplicate  reflections  where 
repeated  measurements  are  made  for  the  same  beam 
orientation. 

When  a  robust/resistant  algorithm  is  used  to  per¬ 
form  the  refinement,  all  multiple  equivalent  reflec¬ 
tions  should  be  included  as  individual  observations. 
This  allows  the  fitting  algorithm  to  downweight 
individual  reflections  which  are  discrepant.  Now  the 
measures  of  agreement,  wR  and  SD,  should  calculate 
goodness  of  fit  using  the  difference  between  the 
weighted  average  of  equivalent  reflections  and  the 
estimated  model.  That  is,  the  variability  of  equivalent 
reflections about their weighted average should be
removed from those measures.

Consider a set of $N_i$ equivalent reflections $y_{ij}$,
where $j = 1, 2, \ldots, N_i$. The fitted model is $m_i(\hat{\boldsymbol{\theta}})$.
The contribution of these reflections to the weighted
sum of squares of residuals can be partitioned formally
as follows:

$$\sum_{j=1}^{N_i} r_{ij}^2(\hat{\boldsymbol{\theta}}) = w_i\,[\bar{y}_i - m_i(\hat{\boldsymbol{\theta}})]^2 + \sum_{j=1}^{N_i} w_{ij}\,(y_{ij} - \bar{y}_i)^2, \qquad (6)$$

where

$$w_i = \sum_{j=1}^{N_i} w_{ij} \quad\text{and}\quad \bar{y}_i = \sum_{j=1}^{N_i} w_{ij}\,y_{ij} \Big/ w_i.$$

The  first  term  in  (6)  is  the  measure  of  agreement  and 
the  second  term  is  the  variability  of  the  individual 
observations  about  their  weighted  average.  There  is 
no  analogous  partition  formula  for  the  denominator 
of  wR.  However,  wR  type  functions  measuring  the 
agreement  of  the  weighted  average  of  equivalent 
reflections  and  the  internal  variability  among  reflec¬ 
tions  can  be  calculated  and  compared  with  the  usual 
agreement  measure  using  all  reflections.  The  specific 
formulas  for  wR  and  SD  are  listed  in  the  top  two 
rows  of  Table  1.  Discussion  of  the  robust/resistant 
agreement  measure  SH  in  Table  1  is  deferred  until  all 
needed  notation  is  defined. 
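
The partition of the weighted sum of squares for one set of equivalent reflections can be checked numerically. The sketch below is illustrative only: the four amplitudes, their weights and the model value are made-up numbers, and `partition_equivalents` is a hypothetical helper name.

```python
import numpy as np

def partition_equivalents(y, w, m):
    """Split sum_j w_j (y_j - m)^2 into an agreement term and an
    internal-variability term for one set of equivalent reflections."""
    w_tot = w.sum()                              # weight of the weighted average
    y_bar = (w * y).sum() / w_tot                # weighted average amplitude
    agreement = w_tot * (y_bar - m)**2           # first term of the partition
    internal = (w * (y - y_bar)**2).sum()        # second term of the partition
    return agreement, internal

# Made-up equivalent reflections, weights, and fitted model value.
y = np.array([10.2, 9.7, 10.5, 9.9])
w = np.array([4.0, 1.0, 2.0, 3.0])
m = 9.5
agreement, internal = partition_equivalents(y, w, m)
total = (w * (y - m)**2).sum()                   # unpartitioned sum of squares
print(agreement, internal, total)
```

The two terms add up to the unpartitioned sum because the cross term vanishes when the average is taken with the same weights.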

As  remarked  above,  weighted  least-squares  refine¬ 
ment  can  give  very  poor  estimates  if  the  error  struc¬ 
ture  is  not  Gaussian.  Sets  of  structure  factors  are  not 
uniform  in  precision.  Weak  reflections  are  likely  to 
have  major  errors.  Instrumentation  effects  can  intro¬ 
duce  orientation  and  absorption  biases  not  included 
in the model $m_i(\boldsymbol{\theta})$. Crystallographers take care of
discrepant  observations  by  screening  data  prior  to 
the  full  matrix  refinement  calculation.  Computer 
programs  (for  example,  Finger  and  Prince,  1975) 
include  quality  control  steps  for  the  rejection  of 


Table  1.  Measures  of  agreement  with  multiple  equivalent  and/or  duplicate  reflections 

Agreement  Internal  variability 
structure factors which do not fit the model. However,
there  are  problems  with  this  crystallographer  con¬ 
trolled  screening  approach.  Fitting  by  non-linear 
weighted  least  squares  to  a  model  with  hundreds  of 
parameters  is  very  complex.  Outlier  data  can  only  be 
screened  if  residuals  are  excessive.  In  such  a  complex 
fitting problem the weighted least-squares algorithm
makes  minor  adjustments  in  parameters  to  fit  discre¬ 
pant  data.  Such  data  may  not  be  detectable  by  a 
residual  test. 

Most  weighted  least-squares  refinement  computer 
programs  can  easily  be  modified  to  be  more  robust/ 
resistant.  Both  weighted  least  squares  and  the 
modification  are  examples  of  a  class  of  estimation 
methods  which,  for  crystal  structure  refinement,  take 
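the form: minimize the loss function

$$f(\boldsymbol{\theta}) = \sum_{i=1}^{N} \rho[r_i(\boldsymbol{\theta})/s] \qquad (7)$$

by selecting $\boldsymbol{\theta}$ so that the gradient, $\boldsymbol{\psi}$, vanishes.
Here $s$ is a resistant estimate of measurement error
scale or uncertainty based on residuals. The defining
equations for the vanishing of $\boldsymbol{\psi}$ (except for a constant)
can be written as

$$\psi_j(\boldsymbol{\theta}) = \sum_{i=1}^{N} \phi[r_i(\boldsymbol{\theta})/s]\; w_i^{1/2}\, r_i(\boldsymbol{\theta})\, \frac{\partial m_i(\boldsymbol{\theta})}{\partial \theta_j} = 0, \qquad j = 1, 2, \ldots, p, \qquad (8)$$

where $\phi(x) = (1/x)\,d\rho(x)/dx$. Weighted least squares
is the special case $\rho(x) = x^2/2$. Then $\phi(x) = 1$ and
the effect of the residual $r_i(\boldsymbol{\theta})$ on the solution to (8)
is proportional to residual magnitude, an unresistant
situation.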

A robust/resistant alternative to weighted least
squares has $\phi(x)$ near to 1 for $x$ close to zero, decreasing
toward zero or possibly exactly zero for $|x|$
large, and a smooth transition in between. Andrews
(1974) and Beaton & Tukey (1974) illustrate the
robust/resistant approach for simple linear regression
situations. The specific estimates are not highly dependent
on the exact shape of $\phi(x)$. Here we use the
Tukey (1974) ‘biweight’ function

$$\phi(x) = [1 - (x/a)^2]^2 \ \text{for } |x| < a; \qquad \phi(x) = 0 \ \text{otherwise}. \qquad (9)$$

This corresponds to a loss function

$$\rho(x) = (x^2/6)\,[1 + \phi^{1/2}(x) + \phi(x)] \ \text{for } |x| < a; \qquad \rho(x) = a^2/6 \ \text{for } |x| \ge a. \qquad (10)$$

The shape of this function is shown in Fig. 1 and
compared with the parabola $\rho(x) = x^2/2$ and also
with the function $\rho(x) = c \ln[1 + (x/b)^2]$, which is a
loss function that would lead to maximum likelihood
estimates if the errors had a Cauchy distribution.
The function of (10) lies close to the least-squares
function for small $x$, but is constant for large $x$, so that
large deviations do not influence the solution of (8).
(Because the function $\phi(x)$ appearing in the algorithm
for finding the minimum is a factor multiplying the
weight, this technique is sometimes called ‘iteratively
reweighted least squares’. While this term is suggestive,
it should be clearly understood that the function
being minimized is quite distinct from a sum of
squares.)

Fig. 1. Least squares compared to two alternative loss functions
(abscissa in Gaussian standard deviation units).

For comparative purposes in Fig. 1, $b = 0.6745$ so
the Gaussian and Cauchy error distributions have
the same probable error. Also, $c = b^2/2$ so the three
functions  have  unit  curvature  at  the  origin  and,  hence, 
weight  reflections  with  small  residuals  identically. 
Based  on  probable  error  normalization  the  biweight 
type  loss  function  (10)  places  much  more  weight  on 
large  residuals  than  does  the  Cauchy-type  function. 
This  is  reasonable  since  error  distributions  as  long¬ 
tailed  as  a  Cauchy  distribution  are  rarely,  if  ever, 
encountered  in  real  experimental  data. 
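
The three loss functions compared in Fig. 1 are simple to evaluate. The following is a minimal sketch, not the authors' program: the function names are hypothetical, the cut-off `a = 6` is used as the default, and the constants `b = 0.6745` and `c = b**2 / 2` are those quoted for the figure.

```python
import numpy as np

def biweight_phi(x, a=6.0):
    """Tukey biweight weight-adjustment factor, equation (9)."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) < a, (1.0 - (x / a)**2)**2, 0.0)

def biweight_rho(x, a=6.0):
    """Biweight loss, equation (10); constant a^2/6 beyond the cut-off."""
    x = np.asarray(x, dtype=float)
    phi = biweight_phi(x, a)
    inside = (x**2 / 6.0) * (1.0 + np.sqrt(phi) + phi)
    return np.where(np.abs(x) < a, inside, a**2 / 6.0)

def cauchy_rho(x, b=0.6745):
    """Cauchy-type loss c*ln[1 + (x/b)^2] with c = b^2/2 (unit curvature at 0)."""
    x = np.asarray(x, dtype=float)
    return (b**2 / 2.0) * np.log1p((x / b)**2)

def least_squares_rho(x):
    """Ordinary least-squares loss x^2/2."""
    return np.asarray(x, dtype=float)**2 / 2.0

for x in (0.1, 1.0, 3.0, 6.0, 10.0):
    print(x, float(least_squares_rho(x)), float(biweight_rho(x)), float(cauchy_rho(x)))
```

For small residuals the three columns nearly coincide; for large residuals the parabola keeps growing, the Cauchy-type loss grows only logarithmically, and the biweight loss levels off at $a^2/6$.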

The crystal structure model $m_i(\boldsymbol{\theta})$ is a non-linear
function of $\boldsymbol{\theta}$. With $\phi(x) = 1$ in (8), iterative approaches
have been used to obtain the solution. Busing,
Martin & Levy (1962) and Finger and Prince (1975)
describe algorithms and computer programs for
iterative solution of the weighted least-squares equations.
In general, such algorithms include:

(a) A linearized form of (8) with $\phi(x) = 1$ for
calculating $\boldsymbol{\theta}^{q+1}$ from a previous solution $\boldsymbol{\theta}^{q}$;

(b) A stopping rule for deciding when the iterative
procedure has converged to an acceptable
solution $\hat{\boldsymbol{\theta}}$; and

(c) A procedure for calculating the precision of the
estimate $\hat{\boldsymbol{\theta}}$.

Newton linearization of (8) gives

$$\Delta\theta_j^{q+1} = \sum_{k=1}^{p} C^{jk} \sum_{i=1}^{N} \phi[r_i(\boldsymbol{\theta}^q)/s^q]\; w_i^{1/2}\, r_i(\boldsymbol{\theta}^q)\, \frac{\partial m_i(\boldsymbol{\theta}^q)}{\partial \theta_k}, \qquad (11)$$


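where $C^{jk}$ is the generic element from the inverse of
the Hessian matrix of $f(\boldsymbol{\theta})$. Specifically,

$$C_{jk} = \sum_{i=1}^{N} w_i\, \omega[r_i(\boldsymbol{\theta}^q)/s^q]\, \frac{\partial m_i(\boldsymbol{\theta}^q)}{\partial\theta_j} \frac{\partial m_i(\boldsymbol{\theta}^q)}{\partial\theta_k} - \sum_{i=1}^{N} w_i^{1/2}\, \phi[r_i(\boldsymbol{\theta}^q)/s^q]\, r_i(\boldsymbol{\theta}^q)\, \frac{\partial^2 m_i(\boldsymbol{\theta}^q)}{\partial\theta_j\,\partial\theta_k}. \qquad (12)$$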


In (12), $\omega(x) = (d/dx)[x\phi(x)] = (d^2/dx^2)\rho(x)$. In
crystallography, it is customary to replace the structure
factor expression (1) by a linear Taylor expansion.
Hence, the second partial term does not appear in
(12). Following Huber (1973), we simplify the first
sum by replacing each $\omega$ factor by the average over
all $N$ observations. The result is the simplified version

$$C_{jk} = \bar{\omega} \sum_{i=1}^{N} w_i\, \frac{\partial m_i(\boldsymbol{\theta}^q)}{\partial\theta_j} \frac{\partial m_i(\boldsymbol{\theta}^q)}{\partial\theta_k}, \qquad (13)$$

where

$$\bar{\omega} = (1/N) \sum_{i=1}^{N} \omega[r_i(\boldsymbol{\theta}^q)/s^q].$$

In (13), $\bar{\omega}$ plays the role of a variance efficiency factor
with respect to Gaussian error structure. That is, the
parameter estimates have variances of the same order
as they would have if the error distribution were
Gaussian and there were $\bar{\omega}N$ reflections in the data
set.

In (11), the width parameter $s^q$ is a resistant estimate
constructed from residuals. Andrews (1974), Tukey
(1974), and Welsch & Kuh (1977) are examples of
those suggesting $s^q = \mathrm{MAD}^q/0.675$, where $\mathrm{MAD}^q$ is
the Median Absolute Deviation; i.e. the median of
the absolute residuals, $|r_i(\boldsymbol{\theta}^q)|$.
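
The resistance of the median-based width estimate can be seen in a small numerical experiment. This sketch uses simulated residuals rather than crystallographic data; the 1% contamination level, the value 100 given to the wild residuals, and the name `mad_scale` are all illustrative assumptions.

```python
import numpy as np

def mad_scale(residuals):
    """Resistant width estimate: median absolute residual divided by 0.675."""
    return np.median(np.abs(residuals)) / 0.675

rng = np.random.default_rng(1)
r = rng.standard_normal(5000)              # well-behaved standardized residuals
r_wild = r.copy()
r_wild[:50] = 100.0                        # 1% of grossly discrepant residuals

s_clean = mad_scale(r)
s_wild = mad_scale(r_wild)
rms_wild = np.sqrt(np.mean(r_wild**2))     # non-resistant root-mean-square width
print(f"MAD scale: clean {s_clean:.3f}, contaminated {s_wild:.3f}")
print(f"r.m.s. of contaminated residuals: {rms_wild:.2f}")
```

The divisor 0.675 makes the estimate close to the standard deviation for Gaussian residuals; a handful of wild values barely moves it, whereas the root-mean-square width is dominated by them.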

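Huber (1973) suggests $s^{q+1} = (a^q/\beta)^{1/2}$ as a resistant
estimate of scale. Here,

$$a^q = \sum_{i=1}^{N} \{\phi[r_i(\boldsymbol{\theta}^q)/s^q]\}^2\, r_i^2(\boldsymbol{\theta}^q) \Big/ (N - p), \qquad (14)$$

and $\beta$ is the expected value of $[Z\phi(Z)]^2$ with $Z$ distributed
according to the true error law. If the error law
is Gaussian and the biweight function (9) cut-off is
$a = 6$, $\beta = 0.72767$ makes $a^q/\beta$ an unbiased estimate
of the variance of the distribution. For a longer tailed
error law, $\beta$ is smaller, but the practical application of
the method is not highly dependent on the choice of
$\beta$. With multiple equivalent reflections, (14) must be
modified in a manner similar to that for weighted least
squares. Calculationally, in determining the solution
$\hat{\boldsymbol{\theta}}$, the product term $\phi w$ is just a least squares type
weight. In the normal equation (8), the terms involving
an equivalent set of reflections $y_{ij}$ ($j = 1, 2, \ldots, N_i$) can
be rewritten as

$$\sum_{j=1}^{N_i} \phi[r_{ij}(\boldsymbol{\theta})/s]\, w_{ij}^{1/2}\, r_{ij}(\boldsymbol{\theta})\, \frac{\partial m_i(\boldsymbol{\theta})}{\partial\theta_k}
 = \sum_{j=1}^{N_i} \phi[r_{ij}(\boldsymbol{\theta})/s]\, w_{ij}\,[y_{ij} - m_i(\boldsymbol{\theta})]\, \frac{\partial m_i(\boldsymbol{\theta})}{\partial\theta_k}
 = \phi_i\, w_i^{1/2}\, r_i(\boldsymbol{\theta})\, \frac{\partial m_i(\boldsymbol{\theta})}{\partial\theta_k}, \qquad k = 1, 2, \ldots, p, \qquad (15)$$

where

$$\phi_i = \sum_{j=1}^{N_i} \{\phi[r_{ij}(\boldsymbol{\theta})/s]\}^2\, w_{ij} \Big/ \sum_{j=1}^{N_i} \phi[r_{ij}(\boldsymbol{\theta})/s]\, w_{ij},$$

$$r_i(\boldsymbol{\theta}) = w_i^{1/2}\,[\bar{y}_i - m_i(\boldsymbol{\theta})],$$

$$w_i = \Big\{\sum_{j=1}^{N_i} \phi[r_{ij}(\boldsymbol{\theta})/s]\, w_{ij}\Big\}^2 \Big/ \sum_{j=1}^{N_i} \{\phi[r_{ij}(\boldsymbol{\theta})/s]\}^2\, w_{ij},$$

and

$$\bar{y}_i = \sum_{j=1}^{N_i} \phi[r_{ij}(\boldsymbol{\theta})/s]\, w_{ij}\, y_{ij} \Big/ \sum_{j=1}^{N_i} \phi[r_{ij}(\boldsymbol{\theta})/s]\, w_{ij}.$$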


Thus, as for the usual weighted least squares calculation
with duplicate observations, a structure factor
robust/resistant fit depends upon an equivalent set of
reflections only through a weighted average $\bar{y}_i$.
Assuming that $1/w_{ij}$ is the variance of $y_{ij}$, then $1/w_i$
of (15) is the variance of $\bar{y}_i$. Now $\phi_i$ plays the role of
the biweight for $w_i^{1/2}\bar{y}_i$. The formula (15) has the
same form as the normal equation formula (8), i.e.
a quadruple product of a weight adjustment factor,
the reciprocal of a standard deviation, a residual, and
a partial derivative. Applying an asymptotic argument
due to Huber (1973), the contribution to the measure
of agreement (14) simplifies to


4>z[?mis«]z  = 


1 2  W'oCe1 1)!sq]ywu 


(16) 


Summation  over  i  in  (16)  gives  the  agreement  term  in 
row  (SH)2  of  Table  1.  The  internal  variability  term 
applies  Huber’s  argument  to  estimating  duplicate 
error. 
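
The way a set of equivalent reflections collapses to a single effective observation in the normal equations can be verified numerically. The sketch below assumes the biweight with cut-off 6 and uses made-up amplitudes, weights, model value and scale; `collapse_equivalents` is a hypothetical name, and the effective weight is computed as the reciprocal variance of the biweighted average, assuming each individual weight is a reciprocal variance.

```python
import numpy as np

def biweight_phi(x, a=6.0):
    """Tukey biweight weight-adjustment factor."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) < a, (1.0 - (x / a)**2)**2, 0.0)

def collapse_equivalents(y, w, m, s, a=6.0):
    """Reduce one set of equivalents to (y_bar, w_i, phi_i, r_i).
    Assumes at least one reflection in the set has a nonzero biweight."""
    phi = biweight_phi(np.sqrt(w) * (y - m) / s, a)
    pw = phi * w                                   # least-squares-type weights
    y_bar = (pw * y).sum() / pw.sum()              # biweighted average
    w_i = pw.sum()**2 / (phi**2 * w).sum()         # 1 / variance of y_bar
    phi_i = (phi**2 * w).sum() / pw.sum()          # effective biweight
    r_i = np.sqrt(w_i) * (y_bar - m)               # effective residual
    return y_bar, w_i, phi_i, r_i

# Made-up set of four equivalents; the last one is grossly discrepant.
y = np.array([10.2, 9.7, 10.5, 30.0])
w = np.array([4.0, 1.0, 2.0, 3.0])
m, s = 10.0, 1.0
y_bar, w_i, phi_i, r_i = collapse_equivalents(y, w, m, s)

phi = biweight_phi(np.sqrt(w) * (y - m) / s)
lhs = (phi * w * (y - m)).sum()                    # sum over individual reflections
rhs = phi_i * np.sqrt(w_i) * r_i                   # single collapsed term
print(lhs, rhs, y_bar)
```

The individual and collapsed forms of the normal-equation term agree, and the discrepant fourth measurement receives zero weight, so it does not pull the average away from the other three.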

Equation (11), with $s^{q+1}$ defined by MAD or by (14),
gives an iterative procedure for solving the system (8).
A computer algorithm for solving the weighted least-squares
refinement problem can be changed into a
robust/resistant algorithm by the inclusion of
$\phi[r_i(\boldsymbol{\theta}^q)/s^q]$ in (11), the inclusion of $\bar{\omega}$ in (13), and an
extra equation defining $s^q$. For the biweight function
(9), $\omega(x) = 5\phi(x) - 4\phi^{1/2}(x)$. Thus $\bar{\omega}$ is easily calculated
from $\phi[r_i(\boldsymbol{\theta}^q)/s^q]$ values. Our experience suggests that,
with either of the above procedures for estimating
width, a value of the constant $a$ in (9) of about 6
screens out extreme outliers while having a minor
effect on the body of the data.
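
Two of the constants used with the biweight can be checked directly. The sketch below, with hypothetical function names, evaluates the second-derivative factor from the biweight itself and compares it with a finite-difference derivative, then integrates the squared product of a standard Gaussian variable and its biweight numerically to recover the Gaussian normalizing constant for a cut-off of 6. The grid size is an arbitrary choice.

```python
import numpy as np

def biweight_phi(x, a=6.0):
    """Tukey biweight weight-adjustment factor."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) < a, (1.0 - (x / a)**2)**2, 0.0)

def biweight_omega(x, a=6.0):
    """omega(x) = d/dx [x phi(x)], written as 5 phi - 4 sqrt(phi)."""
    phi = biweight_phi(x, a)
    return 5.0 * phi - 4.0 * np.sqrt(phi)

def gaussian_beta(a=6.0, n=200001):
    """E{[Z phi(Z)]^2} for standard Gaussian Z, by a Riemann sum on [-a, a].
    The integrand vanishes at both end points and is zero outside them."""
    z = np.linspace(-a, a, n)
    pdf = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    integrand = (z * biweight_phi(z, a))**2 * pdf
    return float(np.sum(integrand) * (z[1] - z[0]))

# Finite-difference check of omega at an arbitrary point.
x0, h = 2.0, 1e-5
fd = ((x0 + h) * biweight_phi(x0 + h) - (x0 - h) * biweight_phi(x0 - h)) / (2 * h)
omega_x0 = float(biweight_omega(x0))
beta = gaussian_beta()
print(f"omega(2) = {omega_x0:.6f}, finite difference = {float(fd):.6f}")
print(f"beta for a = 6: {beta:.5f}")
```

Because the second-derivative factor depends on the residual only through the biweight, a program that already stores the biweights needs no extra pass over the data to form its average.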

For best results, a robust/resistant iterative regression
calculation should start from a resistant estimate
of $\boldsymbol{\theta}$. For simple linear regression situations, Andrews
(1974) suggests a median procedure for getting the
initial $\boldsymbol{\theta}^{0}$ estimate. The complexity of the crystal
structure  model  suggests  that  a  more  practical 
approach  is  to  start  with  whatever  estimate  is  available, 
and  accept  a  penalty  of  more  iterations.  The  stopping 
rule  should  not  be  based  on  a  fixed  number  of 
iterations.  We  have  observed  that  for  some  data  sets 
convergence  is  much  faster  than  with  a  conventional 
weighted  least-squares  algorithm  and  for  some  much 
slower.  We  use  a  stopping  rule  of  nominal  maximum 
parameter  adjustment  measured  in  standard  deviation 
units. 
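
The complete iteration can be sketched for a model that is linear in its parameters, where the partial derivatives are simply the columns of a design matrix. This is an illustrative toy, not the modified refinement program: the straight-line data, the six wild points, the function name `robust_linear_fit`, and the use of the scale times the square root of the inverse-matrix diagonal as a nominal standard deviation in the stopping rule are all assumptions. It starts from an ordinary weighted least-squares solution, which is not resistant, and accepts the extra iterations.

```python
import numpy as np

def biweight_phi(x, a=6.0):
    """Tukey biweight weight-adjustment factor."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) < a, (1.0 - (x / a)**2)**2, 0.0)

def robust_linear_fit(X, y, w, a=6.0, tol=0.1, max_iter=200):
    """Biweight fit of y ~ X @ theta with MAD scale and averaged omega."""
    sw = np.sqrt(w)
    # Non-resistant start: ordinary weighted least squares.
    theta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    for iteration in range(1, max_iter + 1):
        r = sw * (y - X @ theta)                     # standardized residuals
        s = np.median(np.abs(r)) / 0.675             # resistant width estimate
        phi = biweight_phi(r / s, a)
        omega_bar = np.mean(5.0 * phi - 4.0 * np.sqrt(phi))
        C_inv = np.linalg.inv(omega_bar * (X.T * w) @ X)   # simplified normal matrix
        delta = C_inv @ (X.T @ (phi * sw * r))       # parameter shifts
        theta = theta + delta
        nominal_sd = s * np.sqrt(np.diag(C_inv))     # nominal s.d. (assumption)
        if np.max(np.abs(delta) / nominal_sd) < tol: # stop on shift / s.d.
            break
    return theta, s, iteration

# Toy data: straight line with 3% grossly discrepant points.
rng = np.random.default_rng(2)
n = 200
x = rng.uniform(0.0, 10.0, n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + 0.1 * rng.standard_normal(n)
y[:6] += 50.0
w = np.full(n, 1.0 / 0.1**2)

theta, s, n_iter = robust_linear_fit(X, y, w)
print(theta, s, n_iter)
```

The loop stops on the size of the last shift relative to a nominal standard deviation rather than after a fixed number of cycles, in line with the stopping rule described above.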

A number of approaches are available for estimating
the standard deviations of parameter estimates,
$\hat{\boldsymbol{\theta}}$, based on asymptotic expansion arguments; i.e.
$N$, $p$ and $N - p$ must all be large. Some approaches
(see Mallows (1973) and Welch (1975)) estimate the
covariance of $\hat{\boldsymbol{\theta}}$ with a scalar multiple of the inverse
of a matrix with $jk$-th element

$$\sum_{i=1}^{N} \phi[r_i(\hat{\boldsymbol{\theta}})/s]\, w_i\, \frac{\partial m_i(\hat{\boldsymbol{\theta}})}{\partial\theta_j}\frac{\partial m_i(\hat{\boldsymbol{\theta}})}{\partial\theta_k}.$$

These approaches do the estimation by using weights
$\phi[r_i(\hat{\boldsymbol{\theta}})/s]\,w_i$ in a standard weighted least-squares
program. Here, the dependence of $\phi[r_i(\boldsymbol{\theta})/s]$ on $\boldsymbol{\theta}$ is
ignored in the expansion (11). The basic assumption is
that $1/\{\phi[r_i(\hat{\boldsymbol{\theta}})/s]\,w_i\}$, not $1/w_i$, is proportional to the
variance of $y_i$. Our approach assumes that all the
weighted structure factor amplitudes, $w_i^{1/2} y_i$, have
unit variance. They are random observations from a
long tailed error distribution, so that there is high
probability that a few structure factors will have
extreme errors. We use $\phi[r_i(\hat{\boldsymbol{\theta}})/s]$ as a calculational
convenience to reduce the influence of the extreme
data. Thus, we follow Huber (1973) and use a scalar
multiple of $C^{-1}$, where $C = \{C_{jk}\}$ is defined by (13), to
estimate the covariance matrix of $\hat{\boldsymbol{\theta}}$. Specifically, our
variance estimate of $\hat{\theta}_j$ is

S\  =  mSH  fi  CJJ.  (17) 

Oj 

Here, $(SH)^2$ is defined in Table 1. $C^{jj}$ is the $j$-th
diagonal element of $C^{-1}$. $K$ is a bias correction factor
defined as

$$K = 1 + p(1 - \bar{\omega})/(N\bar{\omega}). \qquad (18)$$

Huber’s development of (17) assumes that the error
distribution and the loss function $\rho$ are symmetric, and
that all the diagonal elements of the $C$ matrix ($C_{ii}$
of (13)) are identical. The first two assumptions are
reasonable  for  crystal  structure  refinement.  The  last 
is  never  satisfied.  Hence  (17)  is  an  approximation 
that  needs  empirical  verification. 


Application  to  D(+)-tartaric  acid 

The  Single-Crystal  Intensity  Measurement  Project 
sponsored  by  the  Commission  on  Crystallographic 
Apparatus  of  the  International  Union  of  Crystallo¬ 
graphy  (Abrahams,  Hamilton,  &  Mathieson,  1970, 
hereinafter referred to as AHM) was aimed at determining
the level of consistency that could be obtained
in  the  collection  of  X-ray  diffraction  intensity  data 
from  single  crystals  of  one  compound  in  different 
laboratories.  Crystals  of  D(+)-tartaric  acid  were 
grown  in  one  laboratory  and  distributed  to  16  labo¬ 
ratories  in  eight  countries.  These  laboratories  in  turn 
collected  data  sets  from  the  crystals  they  received, 
using  their  established  techniques.  The  data  sets  were 
then  returned  to  a  single  laboratory  and  compared  by 
statistical  methods,  and  the  analysis  showed  that  there 
were  some  fairly  substantial  differences  among  the 
various  data  sets. 

In  an  attempt  to  determine  what  effect  these  dif¬ 
ferences  would  have  on  the  results  of  a  structure 
refinement,  Hamilton  &  Abrahams  (1970,  hereinafter 
referred to as HA) refined the structure of D(+)-tartaric
acid using 10 of the 17 data sets. (One laboratory
submitted two data sets. Three sets had insufficient
data to refine, and in four the least-squares
procedure failed to converge.) Again, there were some
substantial  differences,  from  one  data  set  to  another, 
in the parameter estimates obtained from the refinement
procedure. Subsequently, Mackenzie (1974)
examined  the  nature  of  systematic  differences  among 
the  various  data  sets.  He  concluded  that  there  was  a 
tendency  for  the  differences  to  be  greatest  for  the 
largest  magnitude  structure  factors,  suggesting  that 
an  important  difference  among  the  experiments  was 
the  amount  of  secondary  extinction  present  in  the 
particular  crystal  used  in  each  experiment. 

In  our  study  we  have  applied  the  robust/resistant 
method  described  in  the  previous  section  to  12  of  the 
Single-Crystal  Intensity  Measurement  Project  data 
sets  in  order  to  determine  if  the  new  approach  would 
reveal  more  about  the  nature  of  the  biases  in  the 
experiments  and  resolve  some  of  the  discrepancies 
in  the  results.  In  the  following  discussion,  we  use  the 
HA  numbers  to  identify  experiments.  The  experiments 
selected  for  reanalysis  include  2,  4,  5,  7,  8,  9,  11a,  12, 
13, 14, 15 and 16. Our twelve include the ten refined by
HA plus experiments 12 and 14, for which their refinements
diverged. The full set of structure factors
consists of every $hk0$ reflection, including equivalent
reflections, within the range $(\sin\theta)/\lambda < 0.5$ Å$^{-1}$, and
all reflections with positive $k$ and $l$ within the same
$(\sin\theta)/\lambda$ range, except in experiment 14 where hkl
reflections were replaced by hkl equivalent ones. The
full  set  includes  332  non-equivalent  reflections.  Only 
experiments  2,  14  and  16  measured  all  332.  We 
included  in  our  analysis  all  the  experiments  for  which 
at  least  232  non-equivalent  reflections  were  measured. 

The  refinement  for  each  data  set  had  three  stages: 
(1)  an  attempt  to  ‘recreate’  the  results  of  HA  by 
repeating  as  closely  as  we  could  the  conditions  of  their 
refinements;  (2)  a  refinement  including  secondary 
extinction;  and  finally  (3)  a  refinement  using  the 
robust/resistant  procedure  described  above. 

The D(+)-tartaric acid structure belongs to space
group $P2_1$, and the unit cell contains 32 atoms: 12
oxygens, 8 carbons, and 12 hydrogens. Therefore the
positions  and  thermal  parameters  for  16  atoms  must 
be  refined  in  the  complete  model.  (Any  single  y- 
coordinate  must  be  fixed  to  define  the  origin.) 

The  computer  program  RFINE4  (Finger  and  Prince, 
1975)  was  modified  as  described  in  the  previous  section 
to  perform  the  robust/resistant  fitting  of  the  model 
(1)  using  the  Tukey  biweight  function  (9).  In  the 
complete  model,  including  extinction,  there  are  115 
parameters  to  be  refined.  These  parameters  include 
47  position  parameters,  66  temperature  parameters, 
a  scale  factor,  and  an  extinction  parameter.  The 
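extinction parameter, $r^*$ (Zachariasen, 1968), has the
form

$$F^{*}_{\mathrm{calc}} = F_{\mathrm{calc}}\,[1 + \beta(\theta)\, r^{*} F_{\mathrm{calc}}^{2}]^{-1/4}, \qquad (19)$$

where

$$\beta(\theta) = \frac{2e^4\lambda^2\,(1 + \cos^4 2\theta)}{m^2c^4V^2\,(1 + \cos^2 2\theta)}\; T.$$

$\lambda$ is the wavelength, $V$ is the unit cell volume, $e$ and $m$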
are  the  charge  and  mass  of  the  electron,  and  c  is  the 
velocity of light. In the absence of absorption information
for determining the pathlength parameter $T$,
this  quantity  was  treated  as  a  constant,  and  the  refined 
extinction  parameter  corresponds  to  the  product  Tr*. 

In  the  model  (1),  the  scattering  factors  fj  are  deter¬ 
mined  in  the  manner  of  Cromer  and  Mann  (1968). 
RFINE4  uses  an  analytic  function  with  coefficients 
chosen  to  fit  smooth  curves  through  the  tabulated 
points  (International  Tables  for  X-ray  Crystallogra¬ 
phy,  1962). 

The constant relative variance weight (3) with
$\sigma_{ci} = 0$ and $b = 0.1$ was used by HA in their refinements.
Since  we  wished  to  reproduce  their  results  as  closely 
as  possible,  the  same  weight  function  was  used.  A 
more  general  weight  function  of  the  form  (3)  with 
$\sigma_{ci} > 0$ would improve the quality of the refinements by
a  better  approximation  of  the  random  errors  in  weak 
reflections.  However,  since  all  our  refinements  use 
the  same  weights,  we  have  a  valid,  though  not  neces¬ 
sarily  optimum,  comparison  of  classical  and  robust/ 
resistant  approaches  to  crystal  structure  refinement. 

In  the  first  refinement  of  each  experiment  we  used 
the  same  initial  parameters  for  position  and  thermal 
vibration  as  did  HA.  The  heavy  atom  parameters  were 
those  of  Okaya,  Stemple  &  Kay  (1966).  The  hydrogen 
atom  parameters  were  preliminary  neutron  diffraction 
results  of  Cox,  Sabine  &  Taylor  (1966).  For  each 
experimental  data  set,  preliminary  iterations  were  done 
by  refining  only  the  scale  factor.  This  was  followed  by 
iterations  with  the  scale  factor  and  the  heavy  atom 
parameters  being  refined  until  convergence.  Finally, 
iteration  with  refinement  of  all  parameters  was 
continued until convergence. In each case, convergence
was defined as maximum $|\Delta\theta|/S(\theta) < 0.10$. To see
whether  the  starting  point  would  influence  the 
solution,  some  refinements  were  repeated  beginning 
with  the  final  results  of  HA.  In  all  cases,  the  two 
refinements  converged  to  the  same  solution,  which 
differed  slightly,  for  reasons  that  are  not  clear,  from 
those of HA. In our refinements to recreate the HA
results, we used all of the data reported by each
experimenter which satisfied $(\sin\theta)/\lambda < 0.5$ Å$^{-1}$. The only
screening was by RFINE4, which rejects discrepant
reflections based on $|r_i(\boldsymbol{\theta})| >$ constant (in our
case, equal to 2). Table 2 classifies the sets of
reflections  for  each  experiment  into  non-equivalent, 
additional  equivalent,  and  total  number  of  reflections. 
The  last  column  is  the  number  of  reflections  used  by 
HA  in  their  refinements.  Experiment  5  included  three 
reflections,  100,  002,  and  020,  which  were  duplicated 
37  times  each.  We  used  two  measurements  of  each 
reflection  in  our  refinement. 
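
The screening and stopping rules described above can be sketched in a few lines of Python. This is only an illustration: the function names and the plain lists of standardized residuals, parameter shifts and standard deviations are hypothetical, and the sketch does not reproduce RFINE4 itself.

```python
def screen_reflections(residuals, cutoff=2.0):
    """Return the indices of reflections that are kept.

    A reflection is rejected when its standardized residual
    satisfies |r_i| > cutoff (cutoff = 2 in the text).
    """
    return [i for i, r in enumerate(residuals) if abs(r) <= cutoff]


def has_converged(shifts, std_devs, tol=0.10):
    """True when the largest shift, in standard deviation units, is below tol."""
    return max(abs(d) / s for d, s in zip(shifts, std_devs)) < tol


# Hypothetical standardized residuals for five reflections.
kept = screen_reflections([0.5, -2.5, 1.9, 2.0, 3.1])
```

With these made-up residuals the second and fifth reflections are rejected, and the remaining ones would be passed to the next refinement cycle.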

The  results  of  the  recreated  refinement  were  used  as 
the  initial  parameter  estimates  in  the  second  refine¬ 
ment.  Experiment  11a  did  not  refine  to  convergence, 
as  several  hydrogen  atom  parameters  continued  to 
oscillate.  The  iteration  with  smallest  SD  was  used  to 
start  the  extinction  refinement.  Initially,  refinement 
was  restricted  to  scale  and  extinction  factors.  After 
reasonable  convergence,  iteration  continued  with 
refinement of the full model. All experiments except
11a and 12 converged. Experiment 11a diverged, while
in experiment 12 some hydrogen parameter estimates
oscillated.

Table 2. Structure factor sets satisfying IUC guides, and Hamilton and Abrahams (HA) total

Experiment   Non-equivalent   Equivalent   Total    HA
number
   2              332              35        367    368
   4              331               0        331    331
   5              327             178        605    607
   7              331             101        432    429
   8              320              35        355    355
   9              303              65        368    366
  11a             251              88        339    342
  12              303              99        402    403
  13              326              92        418    417
  14              332               0        332    273
  15              234              88        322    324
  16              332              96        428    429

The  results  of  the  extinction  refinement  were  used 
as  initial  parameters  in  the  robust/resistant  refinements. 
For all experiments (except experiment 5), at least
ten iterations were needed to bring solid convergence.
As  the  iterations  progressed,  there  was  a  gradual 
change  in  the  residual  configuration,  a  tightening  of 
the  majority  of  the  residuals,  and  a  migration  away 
from  the  body  of  the  data  by  a  small  fraction.  The 
lack-of-convergence  problem  evident  in  earlier  refine¬ 
ments  did  not  carry  over  to  the  robust/resistant  refine¬ 
ment. 


Discussion  of  results 

RFINE4  constructs  the  usual  standard  deviation 
estimates  based  on  the  Hessian  matrix  of  second 
partial  derivatives  and  the  residual  standard  deviation 
estimate  (5).  Program  output  includes  the  change 
from iteration to iteration of all parameters,
measured  in  standard  deviation  units.  In  all  cases  the 
final  stopping  rule  was  that  the  maximum  change  in 
parameters  was  not  more  than  10%  of  the  standard 
deviation.  Clearly,  this  rule  is  conservative,  particularly 
for  robust/resistant  refinements  where  standard 
deviations  are  in  general  underestimated  by  a  least 
squares  type  algorithm. 

Table  3  summarizes  the  data  set  sizes  and  scale 
measure  comparisons  between  our  recreated  refine¬ 
ments  and  the  HA  refinements.  For  the  recreated 
refinements,  the  formulas  in  Table  1  were  used  to 
calculate weighted R (wR) and standard deviation (SD)
estimates  of  scale.  The  agreement  columns  (A)  are 
measures  of  fit  between  the  full  115  parameter  model 
(1)  and  the  data  set  of  non-equivalent  reflections 
(some  being  weighted  averages  of  the  equivalent 
reflections for a single triple of Miller indices). The
size of the data set involved in the fit is 115 plus the
degrees of freedom. After outlier screening, the smallest
data set for fitting is the 229 of experiment 15.

Table 3. Comparison of data set size and scale measures for recreated and Hamilton and Abrahams refinements
1) Oscillatory behaviour. Data for minimum total SD value reported.
2) Diverged.

The  internal  variability  columns  (IV)  are  measures 
of  scale  computed  from  the  differences  between 
equivalent  reflections  totalled  over  all  sets  of  equi¬ 
valent  reflections.  For  experiments  4  and  14  only 
averages  of  equivalent  reflections  were  reported.  One 
degree  of  freedom  is  lost  for  each  set,  so  the  degrees 
of  freedom  (DF)  column  is  the  total  number  of  equi¬ 
valent  reflections  minus  the  number  of  distinct  sets. 
In  all  cases,  the  standard  deviation  (SD)  for  internal 
variability  is  smaller  than  that  for  agreement.  In  all 
but  two  cases,  the  weighted  R(wR)  for  internal 
variability  is  smaller  than  that  for  agreement.  This  is 
very  reasonable  since  internal  variability  measures 
duplication  error  and  experimental  asymmetry  with 
respect  to  equivalent  beam  positions,  while,  in 
addition,  agreement  measures  inadequacies  of  the 
crystal  structure  model. 

Among  all  experiments  (excluding  11a,  which  did 
not  converge)  experiment  5  stands  out  as  having  very 
little  variability  above  that  suggested  by  equivalent 
reflections.  That  is,  only  for  experiment  5  does  the 
model  appear  to  fit  the  data  adequately  relative  to  the 
scatter  of  equivalent  reflections. 

The  SD  total  column  (T)  is  the  average  of  agreement 
and  internal  variability  SD  values  weighted  according 
to  the  respective  numbers  of  degrees  of  freedom. 
This  is  often  called  the  ‘estimated  standard  deviation 
of  an  observation  of  unit  weight’  by  crystallographers. 

It  must  be  intermediate  between  the  agreement  SD 
and  the  internal  variability  SD.  If  the  agreement  SD 
is  larger,  the  total  SD  is  always  an  underestimate  of 
discrepancy  between  data  and  model  and,  hence, 
exaggerates  the  precision  of  the  structure  factor 
data.  There  is  no  such  order  relationship  among  the 
wR  values  because  the  total  is  not  a  weighted 
average.  HA  do  not  include  SD  values,  but  only 
present total R and wR. In Table 3, we compare the
two refinements. The total number of reflections in the
data  set  is  the  sum  of  agreement  degrees  of  freedom, 
internal  variability  degrees  of  freedom,  and  the  number 
of  parameters,  115.  In  all  cases,  our  recreated  wR 
values  are  smaller,  sometimes  by  as  much  as  a  factor 
of  2.  In  all  our  refinements,  we  screened  out  extreme 
reflections with |r_i| > 2. Apparently HA did not
screen  the  data  as  extensively.  Their  data  sets 
(excluding  experiment  14)  are  larger.  We  cannot  say 
whether  our  refinement  is  generally  a  better  fit,  with 
uniformly  smaller  standardized  residuals  or  whether 
the  inclusion  of  a  few  discrepant  reflections  has 
artificially  increased  the  HA  wR  values. 
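
The relation between the agreement, internal variability and total SD values can be illustrated with a short Python sketch. It assumes that the weighted average is taken over variances (the usual pooled estimate); the text says only that the SD values are weighted by their degrees of freedom, so this is an assumption, and the function names and numbers are hypothetical.

```python
import math


def total_sd(sd_agree, df_agree, sd_internal, df_internal):
    """Pooled SD of unit weight from agreement and internal variability.

    Assumption: variances (not SDs) are averaged, weighted by degrees
    of freedom. Either way the result lies between the two inputs.
    """
    pooled_var = (df_agree * sd_agree ** 2 + df_internal * sd_internal ** 2) / (
        df_agree + df_internal
    )
    return math.sqrt(pooled_var)


def total_reflections(df_agree, df_internal, n_params=115):
    """Data set size: both degrees-of-freedom counts plus the parameters."""
    return df_agree + df_internal + n_params


# Hypothetical values: agreement SD 2.0 on 200 df, internal SD 1.0 on 100 df.
sd_total = total_sd(2.0, 200, 1.0, 100)
```

Because the agreement SD is the larger of the two in this made-up case, the pooled value falls below it, which is the underestimate of data-model discrepancy noted in the text.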

Table  4  lists  wR  and  SD  measures  of  agreement  for 
our  three  refinements,  recreation,  extinction,  and 
biweight. The tabulated values are number (N),
weighted  R(wR),  and  standard  deviation  (SD), 
calculated  from  non-equivalent  reflections  by  the 
formulas  in  Table  1.  In  addition,  for  biweight,  the 
Huber  standard  deviation  (SH)  and  the  average 
effectiveness  of  an  observation  (w)  are  listed.  The 
three  N,  wR  and  SD  columns  for  the  recreated  refine¬ 
ment  are  reproduced  from  Table  3  for  easy  com¬ 
parison. 

We  first  note  that  there  was  one  experiment,  11a, 
for  which  the  recreated  refinement  did  not  converge. 
The  refinement  reached  a  point  of  oscillatory  behavi¬ 
our,  with  little  change  in  SD,  but  swings,  from  itera¬ 
tion  to  iteration,  of  several  standard  deviation  units 
for  two  of  the  thermal  vibration  hydrogen  atom  para¬ 
meters.  For  the  extinction  refinement  experiments  11a 
and  12  both  diverged.  A  smallest  standard  deviation 
of  agreement  (SD)  was  reached  for  which  there  were 
still  significant  shifts  in  the  parameters.  Then,  with 
additional  iterations  SD  diverged.  For  six  experiments 
the  extinction  refinement  used  more  observations  than 
the  recreated  ones.  A  closer  look  at  individual 
structure  factor  data  shows  that  the  strong  reflections, 
with  presumably  large  amounts  of  extinction,  were 
brought in line with the body of the data and, hence,
satisfied the cut-off rule of |r_i| < 2. For the robust/
resistant  refinements  all  the  twelve  experiments 
converged  solidly.  The  maximum  parameter  shift  in 
the  final  iteration,  measured  in  standard  deviation 
units,  was  of  the  order  of  a  percent  or  two.  Thus,  not 
only  did  the  robust/resistant  algorithm  force  conver¬ 
gence  for  the  two  experiments  which  previously  did 
not  converge,  it  also  brought  solid  convergence  for 
other  experiments  which  were  still  having  parameter 
shifts  of  5  to  10%  of  their  standard  deviations. 

The number of observations column (N) for the
robust/resistant  refinements  gives  the  number  of 
reflections  which  received  positive  weight.  Since  the 
biweight  function  decreases  monotonically  to  zero, 
there  are  a  number  of  these  reflections  which  receive 
very low weight. The average effectiveness column (w)
suggests  that  the  refinements  have  the  precision  of 
ones  with  about  15%  of  the  reflections  eliminated. 
Among  those  that  refined  to  convergence  the  experi¬ 
ments  with  major  extinction  corrections  are  2,  4,  7,  8 
and  14.  These  experiments  have  the  largest  Tr*  extinc¬ 
tion  parameter  estimates,  use  more  reflections  in  the 
extinction  refinement  than  in  the  recreated  refinement, 
and  have  a  significant  reduction  in  the  standard 
deviation  of  agreement.  Mackenzie’s  (1974)  raw  data 
residual  analysis  clearly  picks  out  experiments  2  and 
4  as  having  major  extinction,  but  the  status  of  the 
other  three  is  unclear.  Thus  the  inclusion  of  the 
extinction  factor  for  these  experiments  brings  the 
strongly  extinguished  reflections  in  line  with  the 
model,  allows  them  to  be  included  in  the  refinement 
and,  at  the  same  time  reduces  the  standard  deviation 
of  agreement. 
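
The statement that the biweight function falls monotonically to zero, so that some reflections with positive weight still count for very little, can be illustrated as follows. The sketch uses the standard Tukey biweight; the tuning constant c = 6 and the scale value are hypothetical choices for illustration and are not taken from the refinements reported here.

```python
def biweight(u):
    """Tukey biweight: (1 - u^2)^2 for |u| < 1, and zero otherwise."""
    return (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0


def biweight_weights(residuals, scale, c=6.0):
    """Weights for residuals scaled by c times a resistant scale estimate.

    c is a hypothetical tuning constant; the value used in the paper
    is not stated in this section.
    """
    return [biweight(r / (c * scale)) for r in residuals]


# Hypothetical residuals in units of the scale estimate.
weights = biweight_weights([0.0, 3.0, 6.0, 9.0], scale=1.0)
n_positive = sum(1 for w in weights if w > 0.0)
```

Here two of the four hypothetical reflections keep a positive weight, and one of those two is already down-weighted to about 56% of full weight.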

There  is  a  significant  positive  correlation  between 
the  degree  of  extinction,  as  measured  by  the  fitted 
parameter  Tr*  of  (19),  and  the  crystal  volume  reported 
by  experimenters  in  AHM.  This  is  what  would  be 
expected  because  the  values  of  T  are  larger  in  larger 
crystals. For the experiments without strong extinction
there is very little difference between the standard
deviations  of  agreement  for  recreated  and  extinction 
refinements,  although  in  no  case  is  the  extinction  one 
larger.  There  is  no  strong  pattern  between  the  standard 
deviations  of  agreement  for  the  extinction  refinements 
and  the  Huber  values  for  the  robust/resistant  refine¬ 
ments.  In  general,  the  Huber  values  are  slightly 
smaller.  The  standard  deviation  of  agreement  values 
obtained  from  the  refinement  are  clearly  biased  low, 
as  pointed  out  above.  The  Huber  values  are  approxi¬ 
mately  10%  larger  and  appear  to  compensate  for  most 
of  that  bias,  but  still  may  be  slightly  low.  The  under¬ 
lying  error  distribution  is  not  Gaussian,  so  the  value 
in  the  Huber  variance  formula  is  too  large. 

If  there  are  significant  biases  in  the  individual 
parameter  estimates,  then  the  standard  deviation 
values  for  these  parameter  estimates  do  not  account 
for  the  differences  among  the  parameters  across 
experiments.  Hence  in  a  manner  similar  to  HA  we 
selected  a  model  group  of  experiments  which  are  free 
of  obvious  parameter  bias.  In  order  to  do  that  we 
looked  at  two  sets  of  parameters  across  the  extinction 
experiments — the  atom  position  parameters  and  the 
heavy  atom  diagonal  thermal  parameters.  For  each 
parameter  in  the  model,  the  twelve  parameter  estimates 
were  ranked  from  smallest  to  largest,  to  determine 
whether  any  of  the  experiments  showed  up  consistently 
as  having  the  extreme  parameters.  For  heavy  atom 
position  parameters  in  experiment  15,  six  of  the  ten 
x-coordinate  position  parameters  were  the  highest. 
In addition two y-coordinate position parameters
were  the  highest  and  five  were  the  lowest.  Thus  one 
experiment  accounted  for  13  out  of  the  19  extreme  x 
and  y  coordinate  position  parameter  estimates. 

Using  a  similar  ranking  on  the  heavy  atom  diagonal 
thermal  parameters,  experiments  11a,  12  and  13 
account  for  all  but  one  of  the  largest  parameter 
estimates  and  17  out  of  the  29  smallest  parameter 
estimates.  Experiment  13  ZZ-coordinate  diagonal 
thermal parameters are 2 to 5 times larger than
those for the rest of the experiments. Experiment
12 has 8 of the largest XX-diagonal thermal
parameters and 9 of the smallest ZZ-diagonal
thermal parameters. Experiment 11a has 8 of the
smallest YY-diagonal thermal parameters and 9 of
the YY-largest diagonal thermal parameters. Based
on the result of this simple extreme value screening
of the heavy atom parameters, experiments 11a, 12,
13 and 15 were eliminated from further consideration.
Experiments 11a and 12 could just as well have been
eliminated because of convergence problems. Also,
experiment 15 clearly has an insufficient number of
structure factor values (228) to estimate anisotropic
thermal vibration parameters.
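
The extreme value screening described above amounts to counting, for each experiment, how often it supplies the largest or the smallest estimate of a parameter. A minimal Python sketch follows; the data layout, parameter names and values are hypothetical and serve only to show the bookkeeping.

```python
from collections import Counter


def count_extremes(estimates):
    """Count how often each experiment gives the extreme estimate.

    estimates maps a parameter name to a dict of
    {experiment label: estimate}. Returns two Counters:
    (times largest, times smallest).
    """
    highest, lowest = Counter(), Counter()
    for by_experiment in estimates.values():
        highest[max(by_experiment, key=by_experiment.get)] += 1
        lowest[min(by_experiment, key=by_experiment.get)] += 1
    return highest, lowest


# Hypothetical x-coordinates of two atoms from three experiments.
example = {
    "x(C1)": {"2": 0.110, "5": 0.100, "15": 0.140},
    "x(C2)": {"2": 0.290, "5": 0.300, "15": 0.350},
}
highest, lowest = count_extremes(example)
```

An experiment that dominates either counter, as experiment 15 does in this made-up example, is a candidate for parameter bias.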

HA  found  that  the  standard  deviations  of  parameter 
estimates  were  too  small  to  predict  the  spread  among 
parameters  across  the  individual  experiments.  The 
situation  was  particularly  bad  for  thermal  vibration 
parameters,  where  the  standard  deviations  were  small 
by  a  factor  of  about  four.  They  attributed  this  to  the 
fact  that  some  of  the  experiments  had  serious  syste¬ 
matic  errors,  and,  hence,  the  model  was  incorrect. 
Since  inclusion  of  extinction  removes  one  source  of 
systematic  error,  one  might  expect  that  the  standard 
deviations  would  come  closer  to  predicting  the  total 
scatter  in  parameter  estimates.  A  further  question  was 
whether  the  robust/resistant  approach  would  compare 
favourably  with  the  classical  extinction  model,  or 
possibly improve the situation by downweighting a
small fraction of the structure factor estimates which
had additional serious systematic errors. For the
recreated  and  extinction  refinements,  we  estimate  the 
standard  deviations  of  individual  parameter  estimates 
by  changing  the  multiplier  on  diagonal  elements  of 
the  inverse  of  the  Hessian  matrix  from  the  standard 
deviation  of  unit  weight  calculated  from  all  of  the 
residuals  to  the  standard  deviation  of  agreement  as 
given in Table 3. For the robust/resistant refinements,
the  standard  deviations  of  parameter  estimates  are 
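calculated from (17). Using these standard deviation
estimates, a chi-square parameter agreement statistic
was calculated across the eight ‘good’ experiments
using the formula

\[
\chi^2_\alpha = \sum_{i=1}^{8} \frac{(P_{\alpha i} - \bar{P}_\alpha)^2}{s_{\alpha i}^2},
\qquad
\bar{P}_\alpha = \frac{\sum_{i=1}^{8} P_{\alpha i}/s_{\alpha i}^2}{\sum_{i=1}^{8} 1/s_{\alpha i}^2} .
\]

Here $\chi^2_\alpha$ is the agreement statistic for the α-th parameter,
$P_{\alpha i}$ and $s_{\alpha i}$ are the estimate and standard deviation of
that parameter from the i-th experiment, and $\bar{P}_\alpha$ is the
weighted (by the reciprocal of the
estimate of variance) estimate of the mean of the
α-th parameter for the eight ‘good’ experiments. If the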
total  variability  across  the  parameter  set  is  explained 
by  the  refinement  standard  deviations,  this  agreement 
statistic  is  approximately  distributed  as  a  chi-square 
variable  with  seven  degrees  of  freedom.  If  it  is  signi¬ 
ficantly  large,  then  either  the  precision  of  some  of  the 
estimates  is  less  than  that  indicated  by  the  refinement 
standard  deviations,  or  there  are  systematic  effects  in 
some  of  the  experiments  which  make  their  parameter 
estimates  outliers. 
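
A minimal Python sketch of this agreement statistic is given below. It follows the verbal definition in the text (inverse-variance weighted mean, then weighted squared deviations); the function name and the numerical values are hypothetical.

```python
def agreement_chi_square(estimates, std_devs):
    """Chi-square agreement statistic for one parameter across experiments.

    Returns (chi2, weighted_mean), where the mean is weighted by the
    reciprocal of each estimated variance.
    """
    weights = [1.0 / s ** 2 for s in std_devs]
    mean = sum(w * p for w, p in zip(weights, estimates)) / sum(weights)
    chi2 = sum(w * (p - mean) ** 2 for w, p in zip(weights, estimates))
    return chi2, mean


# Hypothetical estimates of one parameter from two experiments.
chi2, mean = agreement_chi_square([1.0, 3.0], [1.0, 2.0])
```

With n experiments the statistic has n − 1 degrees of freedom, which gives the seven degrees of freedom quoted for the eight ‘good’ experiments.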

Table  5  is  a  stem  and  leaf  display  of  the  chi-square 
agreement  statistics  for  the  113  individual  parameters 
(excluding  scale  and  extinction)  in  the  complete  16 
atom  model  (1).  For  ease  of  comparison  across  the 
three  methods  of  analysis  (recreated,  extinction  and 
robust/resistant)  the  chi-square  values  have  been 
symmetrized  using  a  logarithmic  transformation. 
Specifically, if χ² is a chi-square variable on k degrees
of freedom, then Y = (k/2)^{1/2} ln(χ²/k) is nearly
symmetrically distributed, with mean and variance
near 0 and 1 respectively. The 1st and 99th percentiles
of the transformed χ² distribution with 7 degrees of
freedom are −3.66 and 1.82 respectively. Hence the
Y values should fall between −4 and +2 if there
are  no  systematic  biases,  and  the  standard  deviations 




Table 5. Chi-square (χ²) agreement statistics for
individual parameter estimates across experiments:
stem-and-leaf displays of Y = (7/2)^{1/2} log_e(χ²/7)


Recreated 


Extinction  Robust/Resistant 


Heavy  atoms :  Position  parameters 


—3 

—2 

87 

7 

— 1 

7 

3 

— 0 

996110 

95 

8765 

0 

011133466789 

1222233346 

001222477 

1 

033455779 

00012246778 

003456677 

2 

5 

116 

0115 

3 

4 

0 

2 

0 

Median 0.6 (1.17)   0.6 (1.17)   0.7 (1.21)

Diagonal  thermal  vibration  parameters 

— 1 

5 

-0 

7400 

0 

6777 

9 

1 

0245667789 

179 

04679 

2 

0233566 

0056779 

67779 

3 

0246 

002222233467899 

0124568899 

4 

00124 

01346668 

5 

5 

Median 1.65 (1.54)   3.2 (2.35)   3.55 (2.58)

Off-diagonal thermal vibration parameters

Stem   Recreated         Extinction     Robust/Resistant
 -2    900
 -1    98532             8521100        53210
 -0    99996653333220    9822           9855553
  0    0011677           01223377789    4455789
  1    5                 0133778        000255
  2                      0              22224

Median −0.4 (0.90)       0.25 (1.07)    0.5 (1.14)

Table 5 (contd.)

Hydrogen  atoms :  position  parameters 


1 

0  40 
0  79 

1  599 

2  03678 

3  1 

4  17 

5  5 

6  67 

7 

8 
9 


8 

112468 

1355578 

368 

3 


1 

40 

2 

04778 

0124678 


0 

0 


Median 2.45 (1.92)   1.4 (1.45)   1.9 (1.66)


Thermal vibration parameters
Median 0.40 (1.11)   0.88 (1.27)   1.02 (1.31)


of  the  individual  parameter  estimates  are  correct.  A 
shift  toward  large  positive  values  indicates  significant 
disagreement  among  parameter  estimates. 
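
The logarithmic symmetrizing transformation used for Table 5 is easy to sketch in Python. The function name is hypothetical; the value 18.475 used in the example is the tabulated 99th percentile of the chi-square distribution with 7 degrees of freedom.

```python
import math


def symmetrized(chi2, k):
    """Y = sqrt(k/2) * ln(chi2 / k) for a chi-square value on k degrees of freedom."""
    return math.sqrt(k / 2.0) * math.log(chi2 / k)


# A chi-square value equal to its degrees of freedom maps to Y = 0.
y_centre = symmetrized(7.0, 7)

# The tabulated 99th percentile for 7 degrees of freedom maps to about 1.82.
y_upper = symmetrized(18.475, 7)
```

Values of Y well above the upper limit therefore mark parameters whose spread across experiments is larger than the refinement standard deviations allow.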

A  stem  and  leaf  display  is  basically  a  histogram 
with  an  additional  unit  of  precision  carried  by  the 
plotting  symbol.  Using  the  recreated  heavy  atom 
position  parameter  display  for  illustration,  the  three 
smallest Y values are −0.9, −0.9 and −0.6, while
the two largest values are 2.5 and 3.0. The 113 model
parameters are grouped by type in Table 5. For easy
comparison the Y units scale is kept constant within
a parameter type. Thus, for example, the units range
for heavy atom position parameters is −3 to +4.
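
How such a display is built can be shown with a small Python sketch. It assumes that values are truncated (not rounded) to one decimal, that the units digit with its sign forms the stem, and that leaves on negative stems are written in descending order so that values increase from left to right, as in the ‘−0’ row of Table 5. These conventions are inferred from the table and are assumptions; the function name is hypothetical.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the lines of a stem-and-leaf display (units stem, tenths leaf)."""
    rows = defaultdict(list)
    for v in values:
        tenths = int(abs(v) * 10 + 1e-9)  # truncate to one decimal place
        stem, leaf = divmod(tenths, 10)
        rows[(v < 0, stem)].append(leaf)

    neg_stems = [s for neg, s in rows if neg]
    pos_stems = [s for neg, s in rows if not neg]
    keys = []
    if neg_stems:  # most negative stem first, down to -0 if positives exist
        low = 0 if pos_stems else min(neg_stems)
        keys += [(True, s) for s in range(max(neg_stems), low - 1, -1)]
    if pos_stems:  # then 0, 1, 2, ... including empty stems
        low = 0 if neg_stems else min(pos_stems)
        keys += [(False, s) for s in range(low, max(pos_stems) + 1)]

    lines = []
    for neg, stem in keys:
        leaves = sorted(rows.get((neg, stem), []), reverse=neg)
        label = ("-" if neg else " ") + str(stem)
        lines.append(f"{label:>3} | " + "".join(str(d) for d in leaves))
    return lines


# The five extreme values quoted in the text, plus one hypothetical value (0.6).
display = stem_and_leaf([-0.9, -0.9, -0.6, 0.6, 2.5, 3.0])
```

Printing the returned lines one per row gives a display in which the stem with no observations (1) is kept as an empty row, so that the shape of the distribution is preserved.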

For  the  heavy-atom  position  parameters  there  is 
little  difference  in  the  distribution  of  agreement 
statistics  for  the  three  types  of  refinements,  with  a 
slight  shift  toward  positive  values  in  all  cases.  To 
avoid  undue  influence  from  outliers  we  use  the  median 
to summarize this shift. The number in parentheses
next to the stem-and-leaf median value is the corresponding
value of (χ²/k)^{1/2}, which is a typical ratio of
parameter estimate error, as measured by the spread
among experiments, to refinement standard deviation.
Thus, the heavy-atom position
parameters suggest systematic differences of about
20% of the standard deviation for all refinement
methods. For heavy-atom diagonal thermal vibration
parameters  introduction  of  extinction  in  the  model 
has  a  strong  influence  on  the  spread  of  parameters 
among  experiments.  Our  hope  was  that  including 
extinction  in  the  model  would  improve  parameter 
agreement,  in  the  sense  that  it  would  only  affect  the 
parameter  estimates  in  experiments  with  large  extinc¬ 
tion  effects.  Clearly,  extinction  has  increased  the 
spread  of  parameter  estimates.  There  must  be  major 
model  inadequacies,  involving  correlation  between 
extinction  and  thermal  parameters,  that  are  incon¬ 
sistent  across  experiments.  Clearly  the  standard 
deviations  computed  in  the  extinction  and  robust/ 
resistant  refinements  tell  us  little  about  the  accuracy  of 
diagonal  thermal  vibration  parameters. 
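
The typical ratio quoted in parentheses in Table 5 follows directly from the median Y value by inverting the transformation: since Y = (k/2)^{1/2} ln(χ²/k), the ratio (χ²/k)^{1/2} equals exp[Y/(2k)^{1/2}]. A short Python sketch, with a hypothetical function name, reproduces the tabulated values.

```python
import math


def typical_ratio(y, k=7):
    """(chi2/k)^(1/2) recovered from Y = sqrt(k/2) * ln(chi2/k)."""
    return math.exp(y / math.sqrt(2.0 * k))


# Median Y values from Table 5 for heavy-atom parameters.
ratio_position = typical_ratio(0.6)     # position, recreated refinement
ratio_diag_rr = typical_ratio(3.55)     # diagonal thermal, robust/resistant
```

The two calls return about 1.17 and 2.58, in agreement with the parenthesized entries next to the corresponding medians.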

For  off-diagonal  thermal  vibration  parameters  the 
refinement  standard  deviations  are  in  good  agreement 
with  the  total  spread  of  the  parameters  across  experi¬ 
ments.  There  are  no  strong  suggestions  of  systematic 
effects  in  these  parameters. 

For  hydrogen  atom  position  parameters  inclusion 
of  extinction  eliminates  a  few  wild  recreated  refine¬ 
ment  parameter  estimates.  The  robust/resistant 
refinement  introduces  several  wild  estimates.  With  a 
factor of about 1.5 for extinction and robust/resistant
refinements,  there  are  still  systematic  effects  for 
hydrogen  atom  position  estimates  which  are  not 
related  to  extinction.  Stem-and-leaf  displays  are  not 
too  informative  for  the  six  hydrogen  atom  thermal 
vibration  parameters.  The  estimates  are  essentially 
meaningless  because  of  the  large  standard  deviation 
estimates, which hide all but the most extreme systematic
effects. For six of the eight good experiments
robust/resistant standard deviation estimates are
approximately 10% smaller than comparable extinction
ones. For experiments 5 and 14 they are 4%
larger.  This  slight  systematic  effect  accounts  for  the 
small  increase  in  the  typical  ratio  values  for  the 
robust/resistant  agreement  statistics  over  that  for 
extinction.  Thus,  on  an  absolute  basis  there  are  no 
major  differences  between  extinction  and  robust/ 
resistant  parameter  agreement. 

Our  overall  conclusion  from  Table  5  is  similar  to 
those  of  HA  and  Mackenzie  (1974).  There  are  strong 
systematic  effects  in  at  least  some  of  the  experiments 
which  invalidate  the  refinement  standard  deviation 
estimates  as  measures  of  the  overall  precision  of 
parameter  estimates.  The  situation  is  worst  for  heavy- 
atom  diagonal  thermal  vibration  parameters,  where, 
typically  the  refinement  standard  deviation  estimates 
are low by a factor of 2.5. For hydrogen position
parameters the standard deviations are low by a
factor of 1.5, and for hydrogen thermal vibration
parameters the standard deviations are low by a
factor of 1.3. Further, the introduction of extinction
in  the  model  causes  diagonal  thermal  vibration 
parameters  to  be  more  variable.  Lastly,  there  are  no 
significant  differences  between  extinction  and  robust/ 
resistant  parameter  agreement. 

In  an  attempt  to  get  a  better  understanding  of  the 
nature  of  the  systematic  effects  we  examined  the 
actual  changes  in  the  113  structural  parameters 
(excluding  extinction  and  scale,  which  are  sample 
dependent)  for  the  three  refinement  procedures  and 
the  eight  ‘good’  experiments.  The  results  varied 
widely  among  experiments.  However,  except  for  a 
possible  tendency  for  carbon  and  oxygen  to  move  by 
small  amounts  in  opposite  directions  when  the  extinc¬ 
tion  parameter  was  introduced,  the  only  consistent 
trend  was  for  the  values  of  the  heavy  atom  diagonal 
thermal  parameters  to  increase  in  the  extinction  re¬ 
finement.  This  is  a  result  of  the  well-known  tendency 
for  extinction  to  depress  the  intensities  of  low  angle 
reflections more than high angle ones. Experiment 5
stands out because it appears to show very little
extinction,  and  the  shifts  of  all  parameters  were 
generally  less  than  one  standard  deviation.  Experi¬ 
ment  4,  on  the  other  hand,  had  large  shifts  in  both 
refinement  stages.  Experiments  9  and  14  also  had 
relatively  small  changes,  but  experiment  14  differed 
from  all  the  others  in  that  the  increase  in  the  diagonal 
thermal  parameters  of  the  heavy  atoms  occurred  in 
the  robust/resistant  refinement.  The  results  agree 
with  the  conclusion  of  Mackenzie  (1974)  that  extinc¬ 
tion  was  an  important  source  of  variability  among 
experiments.  It  is  clear,  however,  that  there  are  other, 
still  unidentified,  systematic  effects  that  influence  the 
different  experiments  in  different  ways. 


Conclusions 

The  data  from  the  Single  Crystal  Intensity  Project 
clearly  contain  systematic  effects  that  differ  among 
the  experiments.  We  note  that  when  these  systematic 
effects  are  minimal,  or  at  least  are  uniform  in  the  full 
data  set,  the  robust/resistant  refinement  agrees  well 
with  the  classical,  fully  weighted,  least-squares  refine¬ 
ment.  Even  when  there  is  good  agreement,  however, 
the  variation  of  parameter  estimates  across  experi¬ 
ments  is  greater  than  would  be  expected  from  the 
standard  deviations  calculated  by  the  refinement 
program.  This  indicates  that  the  calculated  standard 
deviations  are  likely  to  represent  an  overly  optimistic 
assessment  of  accuracy  unless  the  data  set  has  been 
determined,  by  some  independent  criterion,  to  be 
free  of  systematic  effects. 

Several  general  conclusions  may  be  drawn  from 
the  results  of  this  study.  First,  the  robust/resistant 
algorithm  converges  in  some  cases  where  the  classical, 
Newtonian  procedure  does  not.  This  is  a  problem  in 
numerical  analysis  rather  than  in  statistics.  While 
there  are  other  numerical  algorithms  that  are  more 
stable than Newton’s method (see, for example,
Broyden, 1972), the robust/resistant algorithm is
easy  to  implement,  and  has  other  side  benefits.  Second, 
with  a  good  data  set  the  robust/resistant  procedure 
gives  parameter  estimates  close  to  those  given  by  the 
classical  procedure,  and  there  is  a  strong  indication  of 
important  systematic  effects  if  the  two  procedures 
disagree.  Of  course,  with  real  data  one  cannot  deter¬ 
mine  which  analysis  is  closer  to  the  ‘correct’  structure, 
because  it  is  not  known  what  the  correct  structure  is. 
Some  comparison  can  be  done  with  a  synthetic  data 
set  constructed  from  a  known  model  (Prince  & 
Nicholson,  1982).  Third,  Huber’s  (1973)  procedure 
for  estimating  the  standard  deviation  of  a  unit  weight 
observation  and  the  standard  deviations  of  parameter 
estimates  gives  results  that  are  close  to  the  results 
given  by  the  classical  procedure  if  the  model  is  ade¬ 
quate. 

These  conclusions  are  similar  to  those  obtained  by 
statisticians  in  using  robust/resistant  approaches  in 
the  fitting  of  other  types  of  data  (Andrews,  1974; 
Cook,  1977;  Mallows,  1979).  The  robust/resistant 
analysis reproduces, by a more or less objective procedure,
the results of a classical analysis done with a
good deal of hand screening of individual observations.
It identifies those particular data points that
are  most  inconsistent  with  the  model,  which  can  be 
used  as  a  starting  point  for  examining  both  model 
and data to determine the source of the discrepancy.


Calculation of the Electron-Density
Distribution with an Account of Statistical
Errors in Structure Amplitudes and Series
Termination

By  A.  A.  Shevyrev  and  V.  I.  Simonov 

Institute  of  Crystallography,  Academy  of  Sciences  of 
the  USSR,  Moscow,  USSR. 


Abstract 


In calculating the electron-density distribution in
crystals  it  is  desirable  to  eliminate  statistical  errors 
in  the  observed  moduli  of  structure  amplitudes  and 
to  smooth  out  the  effect  of  the  Fourier-series  termi¬ 
nation.  Of  special  importance  is  the  location  of  light 
atoms  in  the  presence  of  heavy  ones  in  the  structure 
as  well  as  the  calculation  of  the  difference-density 
distribution.  Proceeding  from  the  mathematical 
methods  of  stable  Fourier-series  summation  used 
when  the  Fourier-series  coefficients  are  not  free  from 
statistical  errors  (Tikhonov  &  Arsenin,  1974),  the 
following expression has been derived

\[
\rho(\mathbf{r}) = \sum_{\mathbf{H}} w_{\mathbf{H}} F_{\mathbf{H}} \exp[2\pi i\,\mathbf{H}\mathbf{r}],
\]

where

\[
w_{\mathbf{H}} = 0, \quad \text{if } |F_{\mathbf{H}}| < \beta\,\sigma(|F_{\mathbf{H}}|).
\]

The parameter β depends on the error distribution
in |F_H|_obs. If the errors follow the Gaussian distribution,
β = 2 is recommended.

The use of special factors was suggested to
smooth out the Fourier-series termination waves
(Lantsosh, 1961)

\[
\rho(\mathbf{r}) = \sum_{\mathbf{H}} \sigma_{\mathbf{H}} F_{\mathbf{H}} \exp[2\pi i\,\mathbf{H}\mathbf{r}].
\]

In the case of three-dimensional series, the σ_H
factors have the form

\[
\sigma_{\mathbf{H}} = \frac{\sin[\pi h/(H+1)]\,\sin[\pi k/(K+1)]\,\sin[\pi l/(L+1)]}{\pi^3 hkl/[(H+1)(K+1)(L+1)]},
\]

where hkl are the usual indices of the corresponding
structure amplitudes, whereas the values of HKL
depend on the limits of the observed set of F_hkl and
are determined for each structure amplitude in the
following way:

H = max h for the given k, l and h > 0,

H = max |h| for the given k, l and h < 0.

The values of K and L are determined by the same
method. Thus, the σ_H factors are peculiar to each
Fourier coefficient and depend on its indices and
on the used set of structure amplitudes. The full
paper gives examples of practical application of the
above-mentioned methods, and is published elsewhere
(Shevyrev & Simonov, 1981).
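
The termination-smoothing factors described in this abstract can be sketched in Python as a product of three sin(x)/x terms. The function names are hypothetical, and the treatment of a zero index (factor set to 1, the limit of sin(x)/x as x tends to 0) is an assumption, since the printed formula is indeterminate there.

```python
import math


def sinc_factor(index, limit):
    """sin(x)/x with x = pi * index / (limit + 1); taken as 1 for index 0."""
    if index == 0:
        return 1.0
    x = math.pi * index / (limit + 1)
    return math.sin(x) / x


def sigma_factor(h, k, l, H, K, L):
    """Smoothing factor for reflection hkl, given the limits H, K, L.

    H, K and L are the largest |h|, |k|, |l| observed in the row of
    the data set that contains this reflection.
    """
    return sinc_factor(h, H) * sinc_factor(k, K) * sinc_factor(l, L)


# A reflection at the edge of the observed range (h = H = 5) is strongly damped.
edge = sigma_factor(5, 0, 0, 5, 5, 5)
```

The factor is 1 at the origin and decreases toward the limit of the data, which is what suppresses the termination ripples in the summed series.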
On Data Reduction and Error Analysis
for Single-Crystal Diffraction Intensities


By  Robert  H.  Blessing  and  George  T.  DeTitta 

Medical  Foundation  of  Buffalo,  Inc.,  73  High  Street, 
Buffalo,  NY  14203,  USA 

Abstract 

Crystallographic  studies  aimed  at  detailed  mapping  of 
the  electron  density  in  molecules  and  crystals  require 
unusually  careful  efforts  to  eliminate  systematic 
experimental  errors  and  to  recognize  and  minimize 
random errors. Several methods for estimating Bragg
peak limits in step-scanned reflection profiles have
been developed: minimization of σ(I)/I (Lehmann
& Larsen, 1974); location of the changes from decreasing
peak intensity to ‘probably constant’ background
intensity (Grant & Gabe, 1978); and minimization of
an autoconvolution of the intensity profile (Rigoult,
1979). These methods become less reliable as peak-to-background
values diminish, but, given limits for a
suitable  sample  of  the  prominent  peaks  in  a  data  set, 
anisotropic  reflection  width  parameters  can  be  found 
by  least-squares  fit  and  used  to  calculate  peak  limits 
for  even  the  weakest  reflections.  To  observed  base 
widths  W1  and  W2  below  and  above  the  centroids  of 
the  ‘good’  peaks,  we  fit  coefficients  qijk  and  Tt 
according  to 

Wt  =  Qi  +  Tt  tan0,  i  =  1,  2, 

3  3 


7=1  k  =  l 

The quantities $z_j$ are components along crystal-fixed Cartesian axes
of a unit vector normal to the incident and diffracted beams. For
diffractometer axes defined as in the International Tables for X-ray
Crystallography (Vol. IV, pp. 276-278), the $z_j$ are given by

$$(z_1, z_2, z_3) = (\sin\phi \sin\chi,\ \cos\phi \sin\chi,\ \cos\chi).$$
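The fit described above is linear in its coefficients, so it can be sketched with an ordinary linear least-squares solve. The Python example below is only a sketch under stated assumptions: the function names are hypothetical, the data are synthetic and noise-free, and the quadratic form is parameterized by its six symmetric coefficients (only the symmetric part of $q_{ijk}$ can be determined from widths), giving seven unknowns per half-peak.

```python
import numpy as np

def unit_vector(phi, chi):
    """(z1, z2, z3) = (sin(phi) sin(chi), cos(phi) sin(chi), cos(chi))."""
    return np.stack([np.sin(phi) * np.sin(chi),
                     np.cos(phi) * np.sin(chi),
                     np.cos(chi)], axis=-1)

def design_matrix(phi, chi, theta):
    """Columns: z1^2, z2^2, z3^2, 2 z1 z2, 2 z1 z3, 2 z2 z3, tan(theta)."""
    z = unit_vector(phi, chi)
    z1, z2, z3 = z[:, 0], z[:, 1], z[:, 2]
    return np.column_stack([z1 * z1, z2 * z2, z3 * z3,
                            2 * z1 * z2, 2 * z1 * z3, 2 * z2 * z3,
                            np.tan(theta)])

def fit_widths(phi, chi, theta, widths):
    """Least-squares estimate of (q11, q22, q33, q12, q13, q23, T) for one half-peak."""
    params, *_ = np.linalg.lstsq(design_matrix(phi, chi, theta), widths, rcond=None)
    return params

# Synthetic 'good' peaks with made-up setting angles and coefficients.
rng = np.random.default_rng(0)
n = 200
phi = rng.uniform(0.0, 2.0 * np.pi, n)
chi = rng.uniform(0.0, np.pi, n)
theta = rng.uniform(np.radians(5.0), np.radians(35.0), n)
true = np.array([0.30, 0.45, 0.25, 0.05, -0.02, 0.03, 0.60])
widths = design_matrix(phi, chi, theta) @ true
fitted = fit_widths(phi, chi, theta, widths)
```

Once fitted, the same design matrix evaluated at the setting angles of a weak reflection gives its predicted base width.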

Our estimates of $\sigma^2(I)$ include contributions from
(1)  the  Poisson  variance  of  the  stepwise  count  rates, 
corrected  for  coincidence  losses;  (2)  the  variance  of 
the  measured  dead  time  of  the  counting  chain,  and 
the  variance  of  the  correction  factor  for  the  beam 
attenuator,  if  used;  (3)  the  variances  and  covariance 
of  the  parameters  of  a  straight  line  fitted  to  the 
background;  (4)  the  variances  and  covariances  of 
the  parameters  of  polynomial  scaling  functions  of 
X-ray  exposure  time  fitted  to  the  periodically 
measured  reference  intensities;  (5)  the  mean-square 
deviation  from  the  mean  of  the  scaling  factors  derived 
from  these  functions;  and  (6)  the  instrumental 
variance $p^2 I^2$ (McCandlish, Stout & Andrews, 1975).

Since  the  Ottawa  meeting  we  have  discovered  a 
more  correct  formulation  for  the  widths  of  the  Bragg 
peaks  as  an  anisotropic  property  of  the  specimen 
crystal  (cf.  Nelmes,  1980): 

$$W_i^2 = Q_i + T_i[\tan\theta(\alpha_i)]^2, \quad i = 1, 2,$$

where $W_1$ is the base width of the half peak below $\theta(\alpha_1)$
and $W_2$ is the base width of the half peak above $\theta(\alpha_2)$,
and the $Q_i$ are as defined above.

The  research  was  supported  by  NIH  Grant  No. 
AM-19856.  The  full  paper  will  be  submitted  to  the 
Journal  of  Applied  Crystallography. 
On the Problem of Secondary ‘Least-Squares’ Minima

By  R.  Rothbauer 

IBM  Thomas  J.  Watson  Research  Center,  Yorktown 
Heights,  New  York  10598,  USA 


Abstract 

The  method  of  least  squares  defines  the  solution  of  a 
system  of  overdetermined  physical  equations  by  an 
extremal  principle.  As  an  immediate  consequence  of 
this  arises  the  problem  of  secondary  minima  in  the 
related  numerical  solution  algorithms.  We  therefore 
develop  an  alternative  to  the  extremal  principle  which 
is  not  affected  by  secondary  minima.  It  will  be  shown 
that  the  solutions  defined  by  the  new  principle  differ 
only  negligibly  from  those  derived  by  the  method  of 
least  squares  if  the  accuracy  of  the  experiment  and 
of  the  underlying  physical  theory  are  in  accordance 
and  that  the  same  statement  holds  as  well  for  the 
variance-covariance matrix and for the ‘error of an
observation with unit weight’. It will also be shown
that  the  results  obtained  from  both  principles  coincide 
exactly  if  the  system  of  physical  equations  is  linear. 


Outline 

The  theories  of  physics  describe  nature  by  irrational 
numbers  representing  lengths,  times,  charges,  forces, 
etc.  Considering  any  special  object,  some  of  these 
quantities  are  in  general  simply  defined  by  procedures 
of  measurement,  while  another  part  is  related  to  the 
observations  by  more  or  less  complex  theories.  The 
physical  quantities  of  any  object  under  consideration 
may  therefore  roughly  be  separated  into  observational 
parameters  and  model  parameters,  some  of  the  latter, 
$x_{01}, x_{02}, \ldots, x_{0p}$, say, being in general determined by
some of the first, $y_{01}, y_{02}, \ldots, y_{0n}$, say, and a system
of equations

$$0 = g_j(y_{01}, y_{02}, \ldots, y_{0n}, x_{01}, \ldots, x_{0p}), \quad j = 1, 2, \ldots, l \qquad (1)$$

arising from the laws

$$0 = g_j(y_1, y_2, \ldots, y_n, x_1, x_2, \ldots, x_p), \quad j = 1, 2, \ldots, l \qquad (2)$$

of a theory applied to the object, which states that in an
n-dimensional space B of observations $[y_1, y_2, \ldots, y_n]$ only an
m-dimensional subspace N is to be observed. If the equations (2) are
independent,

$$m = n + p - l. \qquad (3)$$

Although  the  theory  treats  the  observables  as  sharp 
irrational  numbers,  they  are  in  practice  defined  by  a 
prescription  of  measurement  and  are  thus  necessarily 
of  a  certain  unsharpness,  which  is  commonly  assumed 
to  be  caused  by  experimental  errors,  distributed 
according  to  a  ‘law  of  errors’  if  the  experiment  is 
repeated.  The  error  concept  is  used  to  explain  the 
fact that any result $[y_{01}, y_{02}, \ldots, y_{0n}]$ of an
experimental investigation will almost never belong to the
subspace  N  of  the  space  B  of  observations  as  required 
by  the  theory,  but  will  be  situated  somewhere  in  its 
neighbourhood,  which  means,  that  the  system  of 
equations  (1)  has  generally  no  exact  solution. 

What  properly  should  be  understood  as  a  solution 
in  the  sense  of  physics  is  developed  in  the  form  of  an 
extremal  principle  by  the  ‘method  of  least  squares’, 
which  describes  how  the  theories  of  physics  are  to  be 
applied, and hence is of enormous importance.

The ‘method of least squares’ takes the theory, eq. (2), for granted in
its full sharpness, assumes that the observations
$[y_{01}, y_{02}, \ldots, y_{0n}]$ are affected by errors, which are
supposed to be distributed with finite standard deviations
$[\sigma_1, \sigma_2, \ldots, \sigma_n]$, and concludes that the best
approximate solution $[x_{01}, x_{02}, \ldots, x_{0p}]$ of (1) is
defined by the point $[y'_{01}, y'_{02}, \ldots, y'_{0n}]$ on the
m-dimensional subspace of the theory (where

$$0 = g_j(y'_{01}, y'_{02}, \ldots, y'_{0n}, x_{01}, x_{02}, \ldots, x_{0p}), \quad j = 1, 2, \ldots, l$$

is mathematically solvable) for which

$$\sum_{i=1}^{n} \frac{(y_{0i} - y'_{0i})^2}{\sigma_i^2} = \text{minimum}. \qquad (4)$$

Detailed  discussions  may  be  found  in  the  textbooks. 

In many cases of practical interest the subspace N of the theory,
described by equation (2), and the observation
$[y_{01}, y_{02}, \ldots, y_{0n}]$ are of such a kind that, besides the
main minimum of (4), there exist a great number of secondary local
minima without physical significance. The search algorithms of the
‘method of least squares’ for the unknown model parameters, which start
from some initial guess, may in practice end in one of these minima
without reaching a correct solution if no approximation close enough to
the main minimum is known in advance.
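The difficulty can be made concrete with a toy problem that is not taken from the paper: a single model parameter x, error-free ‘observations’ y_i = sin(x t_i), and a crude downhill walk standing in for a real search algorithm. All names and numbers in this Python sketch are hypothetical choices for illustration.

```python
import math

T = [0.2 * i for i in range(51)]             # sampling points t_i from 0 to 10
X_TRUE = 3.0                                 # parameter used to generate the data
Y_OBS = [math.sin(X_TRUE * t) for t in T]    # error-free observations

def misfit(x):
    """Sum of squared residuals, the quantity minimized in eq. (4) with unit weights."""
    return sum((y - math.sin(x * t)) ** 2 for y, t in zip(Y_OBS, T))

def local_minima(lo, hi, steps):
    """Grid points lying below both neighbours in a scan of the misfit."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    s = [misfit(x) for x in xs]
    return [xs[i] for i in range(1, steps) if s[i] < s[i - 1] and s[i] < s[i + 1]]

def descend(x, step=1e-3, max_iter=100000):
    """Crude downhill walk: step to the lower neighbour until neither is lower."""
    for _ in range(max_iter):
        here, left, right = misfit(x), misfit(x - step), misfit(x + step)
        if left < here and left <= right:
            x -= step
        elif right < here:
            x += step
        else:
            break
    return x

minima = local_minima(0.5, 6.0, 5500)
best = min(minima, key=misfit)
```

A scan of the misfit finds several local minima, of which only the deepest lies at the generating value; a walk started near that value reaches it, while one started from a poor guess such as x = 1 stops in a secondary minimum.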

The  problem  of  secondary  minima  severely  restricts 
the  possibilities  of  applying  the  ‘method  of  least 
squares’.  It  is — as  we  will  see  later — closely  related 
to  an  inconsistency  in  its  propositions,  which  are 
unsharp  observations  and  theories  formulated  by 
strictly  valid  equations. 

The  assumption  that  the  observational  quantities 
are  subject  to  probabilistically  distributed  errors 
implies  that  the  probability  of  an  observation,  which 
verifies  a  theory  of  type  (2)  with  m  <  n,  is  zero.  In 
other words, the probability of a ‘least squares’ solution
is infinitely small (Gauss, 1839).

If the theories of physics are established from observations
by the induction principle, they may only
be  described  by  equations  if  one  allows  for  some 
unsharpness. 
In the forthcoming paper we will, because of this,
drop  the  assumption  of  laws  described  by  strictly 
valid  equations  (2).  On  this  basis  we  will  develop  two 
alternative  definitions  for  the  solution  of  (1),  which 
are  not  affected  by  the  problem  of  secondary  minima. 
We  will  show  that  the  deviations  of  the  solutions  given 
by  these  definitions  from  those  given  by  the  extremal 
principle  of  the  ‘method  of  least  squares’  are  negligible 
if  the  problem  (1)  is  physically  relevant.  In  particular 
it  will  appear  that  all  three  definitions  coincide  if  the 
system  of  equations  (1)  is  linear. 

The  full  paper  is  available  as  IBM  Research  Report 
RC  9045  (39595)  Mathematics,  and  will  be  submitted 
for  publication  elsewhere. 
Wiener Methods for Electron Density

By  D.  M.  Collins  and  M.  C.  Mahar 

Department  of  Chemistry,  Texas  A  &  M  University, 
Abstract

The  Wiener  formalism  is  widely  used  in  applications 
conveniently  categorized  as  smoothing,  interpolation, 
or  extrapolation  of  stationary  series.  The  present 
application  is  of  the  last-mentioned  type  and  consists 
in  extrapolation  of  a  set  of  structure  factors  (phases 
and magnitudes) beyond the experimental ($2\theta$) limit
of  data  to  increase  resolution  in  the  corresponding 
density  function.  The  application  has  in  view  cases 
for  which  data  are  severely  curtailed  in  angular  range, 
but not necessarily in number. Biological
macromolecular structure problems, though beyond reach
at  present,  fit  the  application  exactly  and,  in  fact, 
were  the  target  from  the  beginning. 

Suppose a set of structure factors, spherically complete for some
range of $|\mathbf{h}|$ including $|\mathbf{h}| = 0$. Now an estimation
of electron density at higher resolution than nominally provided by the
original structure factors is found by solving the matrix equation

$$\mathbf{F}\mathbf{C} = \boldsymbol{\beta}$$

for C and computing

$$\rho(\mathbf{r}) \approx \kappa \Big/ \Big|\sum_{\mathbf{h}} C(\mathbf{h}) \exp\{-2\pi i\,\mathbf{h}\cdot\mathbf{r}\}\Big|^2,$$

where $\kappa$ is a collection of constants, and the summation is over
a half-lattice and its origin at which C = 1.0. The determinant |F| is
of the general Karle-Hauptman type but with entries restricted by the
half-lattice condition. The lead element of $\boldsymbol{\beta}$ is
positive, the others are zero.


1.  Introduction 

Norbert  Wiener’s  (1949)  Extrapolation,  Interpolation, 
and  Smoothing  of  Stationary  Time  Series  provides  the 
basis  for  the  material  presented.  It  should  be  noted  at 
the  outset  that  other  parallel  work  has  led  to  common 
attribution  of  prediction  theory,  which  emphasizes 
the  aspect  of  extrapolation,  to  both  Kolmogorov  and 
Wiener as originators (Encyclopedic Dictionary of
Mathematics, 1977). In recognition of this, we shall
have occasion to refer to Wiener-Kolmogorov (wk,
hereafter) prediction theory but it is the work of
Wiener  which  lies  behind  much  of  what  follows. 

It was Wiener’s (1949) purpose to bring together the theory and
practice of two fields of diverse tradition, communication engineering
and time series in statistics. In the latter field algebraic
relationships, especially those involving correlations, are used to
find desired results which are best in some average sense.
Communication engineering has the special concern of discovering the
stability of oscillatory systems and whether their oscillations die out
or grow without bound. The present crystallographic application employs
Wiener’s synthesis of the different techniques but in simplified forms
worked out primarily by geophysicists (cf. Robinson & Treitel, 1980).

Wiener’s (1949) synthesis of techniques and the various forms which
result are based upon the (one-dimensional) Fourier-transform pair. The
appropriate relationships are

$$\rho(x) = \int_{-\infty}^{\infty} F(u) \exp\{-2\pi i u x\}\, du, \qquad (1)$$

which defines F(u), and F(u) is given by

$$F(u) = \int_{-\infty}^{\infty} \rho(x) \exp\{2\pi i u x\}\, dx. \qquad (2)$$
In crystallographic application the continuous but periodic electron
density $\rho$ requires a discretely sampled structure factor F and, in
fact, the current fast Fourier-transform computer algorithms require
both $\rho$ and F to be discretely sampled on regular grids (Gentleman
& Sande, 1966) for numerical calculations. Relaxation of the customary
presumption that Fourier analysis is to be restricted to the axis of
real values of x constitutes an apparent complication as $\rho$ thus
becomes a function in the complex plane. But it is advantageous to move
off the axis of reals into the complex plane because $\rho$ then can be
subjected to the full power of analytic-function theory. This is the
approach of Wiener and in his book-length analysis he presents the
detail necessary for mathematical rigour, but with emphasis upon
functions which are continuous rather than discretely sampled as in
crystallographic applications.

The present book deals with statistics in a very
general sense and it is worth emphasizing that structure
factors (both amplitude and phase) are used
quite freely in this part as numerical facts, as statistics.
The  emphasis  is  not  on  algebraic  relationships  between 
structure  factor  and  electron  density,  but  on  the 
density  itself,  or  at  least  one  of  its  roots,  as  a  function 
of  definite  character  in  agreement  with  the  available 
statistical  information,  the  structure  factors.  The 
criterion  of  agreement  is  the  statistical  measure  of 
mean-squared-error  whose  functionally  constrained 
minimization  is  often  the  means  to  a  desired  result. 

Gassmann  (1977)  discussed  Wiener  filtering  as 
structure  determination  by  the  filtering  of  noisy 
images.  In  that  paper  a  filter  was  given  for  which  the 
denominator  is  a  Patterson  coefficient.  From  an 
algebraic  point  of  view  it  is  necessary  only  to  make 
certain  that  no  individual  Patterson  coefficient  is 
zero  in  order  that  the  filter  be  valid.  In  functional 
analysis  the  proposal  of  such  a  filter  constitutes  a  very 
daring  assertion  indeed.  It  is  the  relevant  analysis  to 
which  Wiener  (1949)  had  addressed  himself  and  from 
which the form and power of his methods are derived.
Functions  free  of  zeros  and  analytic  along  an  axis  of 
reals  and  in  a  contiguous  region  of  the  complex  plane 
have  the  central  role  in  Wiener’s  work.  Let  it  be 
observed  that  in  each  of  the  following  applications, 
including  the  connections  with  information  theory, 
a result always turns upon implicit or explicit discovery
of functions in n-dimensional generalization
which are similar to those used by Wiener.


2.  Wiener’s  problem  and  its  formal  solution 


Wiener’s (1949) simplest and basic problem was the extrapolation or
prediction of time series. A prediction, of course, can never be a
perfect continuation of a series for such a state of affairs would
preclude the possibility of new information becoming available.
Nevertheless, a series is subject to statistical prediction and the
problem may be posed as the prediction or estimation of data not yet
measured. The series to be treated are assumed real (for convenience)
and stationary which in regard to the autocorrelation

$$\phi_\tau = \lim_{N\to\infty} \frac{1}{2N+1} \sum_{t=-N}^{N} x_{t+\tau}\, x_t, \qquad (3)$$

may be interpreted to mean not only that $\phi$ exists, as it must for
a physical process, but also that

$$\phi_\tau = \lim_{N\to\infty} \frac{1}{2N+1} \sum_{t=-N}^{N} x_{t-Q+\tau}\, x_{t-Q}, \qquad (4)$$

where Q is arbitrary and $\phi_\tau$ is entirely unaffected by
its  specification.  It  will,  of  course,  be  most  convenient 
to  take  Q  =  0  but  the  formulation  makes  it  clear 
that  an  autocorrelation  is  independent  of  the  time 
origin  for  a  stationary  series. 
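The independence of origin expressed by (3) and (4) is easy to check numerically. In the Python sketch below one full period of a periodic real series stands in for the limit of large N, so that the average is exact; this substitution, the function name `autocorrelation` and the particular series are assumptions made for illustration.

```python
import math

def autocorrelation(x, tau, shift=0):
    """(1/L) * sum_t x[t - shift + tau] * x[t - shift], indices taken modulo L."""
    L = len(x)
    return sum(x[(t - shift + tau) % L] * x[(t - shift) % L] for t in range(L)) / L

# One full period (L = 40) of a two-frequency real series.
series = [math.cos(2 * math.pi * t / 8) + 0.5 * math.sin(2 * math.pi * t / 5)
          for t in range(40)]

lags = [autocorrelation(series, tau) for tau in range(4)]
shifted = [autocorrelation(series, tau, shift=17) for tau in range(4)]
```

The two lists agree to rounding error whatever value of `shift` (the Q of equation (4)) is chosen.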

Suppose a series known for all past time and truncated at t. The
prediction problem is solved upon successful estimation of future
series members as

$$\hat{x}_{t+\alpha} = a_0 x_t + a_1 x_{t-1} + a_2 x_{t-2} + \cdots, \qquad (5)$$

for which there is a minimum in the formal expression

$$\lim_{N\to\infty} \sum_{t=-N}^{N} |x_{t+\alpha} - \hat{x}_{t+\alpha}|^2. \qquad (6)$$

It is clear that the coefficients a of a linear prediction operator are
generally dependent upon the prediction span $\alpha$. For any $\alpha$
the prediction operator is obtained by solving the equations resulting
from minimization of the time expectation

$$I = E\Big\{\Big[x_{t+\alpha} - \sum_{s=0}^{\infty} x_{t-s}\, a_s(\alpha)\Big]^2\Big\}, \qquad (7)$$

with respect to $a_s(\alpha)$. The equations to be solved are

$$\sum_{s=0}^{\infty} a_s(\alpha)\, \phi_{\tau-s} = \phi_{\tau+\alpha}; \quad \tau = 0, 1, 2, \ldots \qquad (8)$$

and  the  resulting  wk  linear  predictor  may  be  used  in 
(5)  to  obtain  estimates  of  data  not  yet  measured. 
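A finite version of equations (8) can be solved directly as a small linear system. The sketch below assumes a real series, so that the autocorrelation is symmetric in the lag, truncates the operator to a chosen length, and uses the hypothetical name `wk_predictor`. The test case is a geometric autocorrelation r^|k|, chosen because the answer is known in closed form: only the first coefficient is non-zero and equals r raised to the prediction span.

```python
import numpy as np

def wk_predictor(phi, span, length):
    """Solve sum_s a_s phi(tau - s) = phi(tau + span) for tau, s = 0..length-1.

    `phi` is a callable returning the autocorrelation at a non-negative lag;
    a real series is assumed, so phi(-k) = phi(k).
    """
    matrix = np.array([[phi(abs(tau - s)) for s in range(length)]
                       for tau in range(length)])
    rhs = np.array([phi(tau + span) for tau in range(length)])
    return np.linalg.solve(matrix, rhs)

r = 0.8
coefficients = wk_predictor(lambda k: r ** k, span=2, length=4)
```

The example also shows the point made in the text that follows: nothing about the series enters except its autocorrelation.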

An obvious and important feature of (8) is that the wk linear predictor
depends only upon the autocorrelation of a process. The application of
a prediction operator is independent of origin definition in time and,
in fact, is even independent of the choice of time series itself so
long as it is chosen from the infinity of series with autocorrelation
coefficients equal to those used in (8). Although the summation index
in (8) takes on only non-negative values, it could take on all values
because by construction a is a causal or one-sided function which
vanishes for negative values of s.
If one had a reasonably large set of autocorrelation coefficients
without lacunae, (8) would lead at once to an approximation of (5) and
a formally satisfactory solution to the simple prediction problem.
Alternatively, Fourier coefficients for a might be sought for
application in the dual space through transformation of (8). Either
approach is problematic because a physical process and its
autocorrelation are not measured simultaneously. This practical
principle of complementarity dictates the inaccessibility to physical
measurement of either according to the measurability of the other. The
common side of the principle is that for which a process is measured
and its autocorrelation must be approximated by some method of spectral
estimation (Oppenheim & Schaefer, 1975). This is a very substantial
problem and wk prediction often founders upon the difficulty of
autocorrelation determination.

Crystallographic application (Collins, 1978) involving extrapolation in
reciprocal space, hence resolution enhancement in direct space, is a
problem of the other sort. Electron density must be positive definite
and may therefore be written as

$$\rho = |g|^2 > 0 \qquad (9)$$

with complete generality. Fourier transformation of (9) shows the
autocorrelation of G, the transform of g, is F and the
resolution-enhancement problem is seen to begin with the knowledge of
an autocorrelation and the need for its deconvolution. But the
difficulties of wk extrapolation are not removed by this change of
complexion. The deconvolution of F is as much a problem as the
convolution of x and the extrapolation of G as uncertain as that of x
apart from other information or constraints.

The needed constraints are provided in Wiener’s (1949) development of
the wk linear predictor through functions analytic and free of zeros
and poles in a half-plane and on its (finite) boundaries. For
resolution enhancement the corresponding computations would involve
explicit extrapolation of G and subsequent convolution to yield
estimates of high-resolution structure factors. As an explicit
computation this is entirely impractical but it is achieved by
implication in § 3.


3.  Density  and  filters  in  one  dimension 

A. Principles. Wiener’s (1949) analyses were concerned with continuous
functions almost exclusively. For crystallographic applications most
Fourier transformations involve discretely sampled functions, both
density and its transform. The corresponding implied periodicity in
both spaces invites Fourier analysis upon the unit circle or its
n-dimensional generalization, the unit polycylinder. An immediate
practical result is the z-transform obtained for time series by
replacing $\exp(-2\pi i f)$ with z. Then a Fourier transform and the
related z-transform are

$$X(f) = \sum_{t=-\infty}^{\infty} x_t \exp(-2\pi i f t), \qquad (10)$$

$$X(z) = \sum_{t=-\infty}^{\infty} x_t z^t, \quad |z| = 1. \qquad (11)$$

While (11) has been written with |z| = 1, it is clear that restriction
of X to be on the unit circle is necessary only to preserve its nature
as a Fourier transform. X(z) is a Laurent series at every (finite)
point in the complex plane and upon the unit circle is the periodic
Fourier transform of x. Let it be observed that the change of variable
maps the half-plane and its boundaries over which Wiener’s functions
would be free of zeros and poles to the unit circle and the interior
domain which it bounds.
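The relation between the z-transform and the Fourier transform can be checked on a short sequence. This Python sketch assumes a finite causal sequence (the series in the text is two-sided and unbounded) and the convention z = exp(-2 pi i f) stated above; the name `z_transform` is hypothetical. On the unit circle at f = k/N the polynomial reproduces the discrete Fourier transform, while off the circle it remains an ordinary polynomial value.

```python
import numpy as np

def z_transform(x, z):
    """Evaluate X(z) = sum_t x_t z**t for a finite causal sequence x_0..x_{N-1}."""
    # np.polyval expects the highest power first, hence the reversal.
    return np.polyval(np.asarray(x)[::-1], z)

x = np.array([1.0, -0.5, 0.25, 2.0])
N = len(x)
z_circle = np.exp(-2j * np.pi * np.arange(N) / N)   # z = exp(-2 pi i f), f = k/N
on_circle = z_transform(x, z_circle)
inside = z_transform(x, 0.5)                        # a point within the unit circle
```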
Error of extrapolation [cf. equation (5)] is readily composed for unit
prediction span ($\alpha = 1$) as

$$e_t = x_t - \hat{x}_t = x_t - \sum_{s=-\infty}^{t-1} x_s\, a_{t-s-1}, \qquad (12)$$

$$1 - a = (1, -a_0, -a_1, \ldots) = \gamma. \qquad (13)$$

It is assumed that only x is known and that through its Fourier
transform it satisfies the Paley-Wiener condition (Collins, 1978;
Robinson, 1967) which requires $0 < |X(f)| < \infty$. Determination of
$\gamma$ is to be driven by constraints upon it and upon the error
process e. An initial motivation for determination of the
unit-extrapolation-error (or prediction-error) filter $\gamma$ is its
application by (13) and (5) to unit extrapolation of x.

Another equally important use for $\gamma$ follows from the constraint
placed upon e. What seems the most cogent constraint upon e in (12) is
that the error series should be uncorrelated (Collins, 1978; Robinson
& Treitel, 1980). If it were not so, then the filter $\gamma$ would not
have extracted the stationary statistical features or predictability
from x. The constraint is imposed upon the z-transform of

$$e_t = \sum_{s=-\infty}^{\infty} x_s\, \gamma_{t-s}, \qquad (14)$$

$$E(z) = X(z)\,\Gamma(z), \qquad (15)$$

for which

$$|E(f)|^2 = |X(f)|^2\, |\Gamma(f)|^2 = \sigma^2/L, \qquad (16)$$

$\sigma^2$ is a constant and L is the period of f. This equation
expresses the requirement that e be an uncorrelated error series, for
only if this is so is its Fourier transform of constant magnitude as
required by (16). Evidently $1/|\Gamma(f)|^2$ gives the spectral
characteristics of x and apart from a constant factor completely
represents its correlations and their structure.


Applications require the practicality of finite filters
for which (14) may be rewritten as

$$e_t = \sum_{s=t-n}^{t} x_s\, \gamma_{t-s}. \qquad (17)$$


The change in upper limit is only a formality as $\gamma$ was designed
to be causal or one-sided; the change in lower limit reflects an
arbitrary limit of n + 1 upon filter length and the requirement that
higher filter elements be zero. Because there is no limit to the length
of x in time past, (16) may be written as

$$\sigma^2/L = |X(f)|^2\, \Big|\sum_{s=0}^{n} \gamma_s \exp\{-2\pi i s f\}\Big|^2 \qquad (18)$$

in conformity with (17). Presumably e does not vanish and since for a
physical process of the sort under consideration
$0 < |X(f)| < \infty$, then

$$\Gamma(z) = \sum_{s=0}^{n} \gamma_s z^s \neq 0, \quad |z| = 1, \qquad (19)$$

and equation (18) may be rearranged to

$$|X(f)|^2 = \frac{\sigma^2/L}{\Big|\sum_{s=0}^{n} \gamma_s \exp\{-2\pi i s f\}\Big|^2}, \qquad (20)$$

the form in which the spectral density of x depends only upon the
causal extrapolation-error filter, apart from a constant factor.
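The statement that the filter alone fixes the spectral density, and through it the correlations, can be checked numerically. The Python sketch below takes a short real filter with both roots outside the unit circle (an arbitrary choice), forms the spectrum of (20) with the constant factor set to one, transforms it back to autocorrelation coefficients, and applies the Toeplitz matrix of those coefficients to the filter. The function names and the sampling density of 1024 points are assumptions of the sketch.

```python
import numpy as np

def spectrum_from_filter(gamma, n_freq, variance=1.0):
    """variance / |Gamma(f)|^2 sampled at f = k / n_freq."""
    Gamma = np.fft.fft(gamma, n_freq)       # sum_s gamma_s exp(-2 pi i s f)
    return variance / np.abs(Gamma) ** 2

def autocorrelation_from_spectrum(power):
    """Inverse DFT of the sampled spectrum; element m approximates lag m,
    element -m (counted from the end) approximates lag -m."""
    return np.fft.ifft(power)

gamma = np.array([1.0, -0.9, 0.2])          # (1 - 0.4 z)(1 - 0.5 z), roots at 2.5 and 2
phi = autocorrelation_from_spectrum(spectrum_from_filter(gamma, 1024))

n = len(gamma) - 1
toeplitz = np.array([[phi[i - k] for k in range(n + 1)] for i in range(n + 1)])
residual = toeplitz @ gamma                 # expected: (variance, 0, ..., 0)
```

Only the leading element of `residual` is non-zero, and it equals the variance that was put into the numerator.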

The  extrapolation-error  filter  is  determined  by  use 
of  equations  (8)  and  (13).  A  unit  prediction  span  and 
a  filter  length  of  n  +  1  lead  to 


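$$\begin{bmatrix} \phi_0 & \phi_{-1} & \cdots & \phi_{-n} \\ \phi_1 & \phi_0 & \cdots & \phi_{-n+1} \\ \vdots & \vdots & & \vdots \\ \phi_n & \phi_{n-1} & \cdots & \phi_0 \end{bmatrix} \begin{bmatrix} \gamma_0 = 1 \\ \gamma_1 \\ \vdots \\ \gamma_n \end{bmatrix} = \begin{bmatrix} \beta \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \qquad (21)$$

as is easily verified. Because x may be combined with its Hermitian
transpose to give $\phi = L^{-1} x^{\dagger} x$, equation (14) may be
used to equate $\beta$ and the variance of e by

$$\sigma^2 = \frac{1}{L}\, e^{\dagger} e = \frac{1}{L}\, \gamma^{\dagger} x^{\dagger} x\, \gamma = \gamma^{\dagger} \phi\, \gamma = \beta > 0. \qquad (22)$$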

Equation (21) is solvable in the ordinary sense for any order n if the
matrix $\phi$ is positive definite. The equation is discussed at length
in the geophysical literature in which it is shown to be very generally
solvable, even when ill-conditioned, by a highly efficient recursion
procedure based upon the Toeplitz structure of $\phi$ (Claerbout, 1976;
Robinson & Treitel, 1980).
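A recursion of the kind referred to can be sketched as follows. This is a generic Levinson-type recursion for a Hermitian Toeplitz system with right-hand side (beta, 0, ..., 0), written for illustration; it is not claimed to reproduce the particular algorithms of the cited references, and the name `levinson` and the test autocorrelation are hypothetical. Each step lengthens the filter by one element using only the previous filter and one new lag.

```python
import numpy as np

def levinson(phi):
    """Solve sum_k phi[i-k] gamma[k] = beta * delta(i, 0) with gamma[0] = 1.

    phi holds lags 0..n of a positive-definite Hermitian Toeplitz matrix,
    with phi[-k] understood as conj(phi[k]). Returns (gamma, beta).
    """
    phi = np.asarray(phi, dtype=complex)
    gamma = np.array([1.0 + 0.0j])
    beta = phi[0].real
    for m in range(1, len(phi)):
        # Mismatch in the new last row: sum_k gamma_k phi_{m-k}, k = 0..m-1.
        delta = np.dot(gamma, phi[m:0:-1])
        lam = -delta / beta
        ext = np.append(gamma, 0.0)
        # Add the reversed, conjugated filter to cancel the mismatch.
        gamma = ext + lam * np.conj(ext[::-1])
        beta *= 1.0 - abs(lam) ** 2
    return gamma, beta

phi = np.array([3.0, 1.0 + 0.5j, 0.2 - 0.3j, 0.1j])   # made-up lags, diagonally dominant
gamma, beta = levinson(phi)

# Full matrix built only to check the result.
size = len(phi)
matrix = np.array([[phi[i - k] if i >= k else np.conj(phi[k - i])
                    for k in range(size)] for i in range(size)])
check = matrix @ gamma
```

The work grows as the square of the order rather than the cube, which is the efficiency the text refers to.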

The extrapolation-error filter is unique and in two senses. First,
within a constant factor $|\Gamma(z)|^2$ is given on the unit circle by
$|X(f)|^{-2}$. By function theory the ordinary polynomial (19) has on
the unit circle its maximum and minimum moduli in the region bounded by
the unit circle (Caratheodory, 1964). Thus $\Gamma(z)$, having no zeros
on or within the unit circle, has all its roots at |z| > 1. But with
$|\Gamma(z)|^2$ given for |z| = 1 this corresponds to the unique Fejér
factorization in which all and only the roots of $|\Gamma(f)|^2$
outside the unit circle are used in the formation of $\Gamma(z)$
(Collins, 1978). Evidently $\gamma$, the extrapolation-error filter,
the Fourier transform of unique $\Gamma(f)$, is itself unique. Second,
$\gamma$ is a minimum-delay operator. Consider some
$\gamma' \neq \gamma$ for which $|\Gamma'(f)|^2 = |\Gamma(f)|^2$. The
minimum-delay property is given by Robinson’s theorem (Claerbout, 1976)

$$\sum_{s=0}^{m} |\gamma_s|^2 \geq \sum_{s=0}^{m} |\gamma'_s|^2; \quad m = 0, 1, \ldots \qquad (23)$$

For at least one of the equations given by (23) the equality must fail
if $\gamma' \neq \gamma$. This minimum-delay property of unique
$\gamma$ shows that there is no other sequence of coefficients which
with equal compactness represents $|\Gamma(f)|^2$, hence $|X(f)|^2$ and
the statistical structure of stationary x.
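A small numerical case illustrates both senses of uniqueness. The sketch below builds a two-root filter with both roots outside the unit circle and a second sequence obtained by reflecting one root to the inside, which leaves the modulus on the unit circle unchanged. The particular roots and the helper names are arbitrary choices for illustration.

```python
import numpy as np

# Coefficients are listed in ascending powers of z.
min_delay = np.convolve([1.0, -0.4], [1.0, -0.5])   # (1 - 0.4 z)(1 - 0.5 z)
reflected = np.convolve([1.0, -0.4], [-0.5, 1.0])   # (1 - 0.4 z)(z - 0.5)

def roots_in_z(coeffs):
    """Roots of sum_s c_s z**s; np.roots expects the highest power first."""
    return np.roots(coeffs[::-1])

def partial_energies(coeffs):
    """Running sums of |c_s|^2, the quantities compared in inequality (23)."""
    return np.cumsum(np.abs(coeffs) ** 2)

same_spectrum = np.allclose(np.abs(np.fft.fft(min_delay, 64)),
                            np.abs(np.fft.fft(reflected, 64)))
```

Both sequences have the same spectrum and the same total energy, but the minimum-delay one has the larger partial sum at every order short of the last.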

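B. Applications. Crystallographic application to resolution enhancement
in one dimension follows from the assertion of equation (9) that
$\rho = |g|^2 > 0$. As pointed out earlier, this requires F to be an
autocorrelation and there follows immediately an equation analogous to
(21),

$$\begin{bmatrix} F_0 & F_1^* & \cdots & F_n^* \\ F_1 & F_0 & \cdots & F_{n-1}^* \\ \vdots & \vdots & & \vdots \\ F_n & F_{n-1} & \cdots & F_0 \end{bmatrix} \begin{bmatrix} C_0 = 1 \\ C_1 \\ \vdots \\ C_n \end{bmatrix} = \begin{bmatrix} \sigma^2 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \qquad (24)$$

and, from its solution,

$$c\rho(x) = \frac{\sigma^2/L}{\Big|\sum_{k=0}^{n} C_k \exp\{-2\pi i k x\}\Big|^2}, \qquad (25)$$

a high-resolution electron-density estimator (Collins, 1978); L is the
period of both $\rho$ and g.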
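The estimator can be tried on a case with a known answer. The Python sketch below assumes unit period, structure factors F_h = r^|h| exp(2 pi i h x0) for a single hypothetical peak at x0 (a Poisson-kernel shape chosen only because its all-pole form is exact), and a constant numerator kappa taken as the lead element of the right-hand side; none of these choices comes from the paper, and the function names are hypothetical.

```python
import numpy as np

def solve_filter(F):
    """Solve the Hermitian Toeplitz system built from F_0..F_n; returns (C, kappa), C[0] = 1."""
    size = len(F)
    matrix = np.array([[F[h - k] if h >= k else np.conj(F[k - h])
                        for k in range(size)] for h in range(size)])
    e0 = np.zeros(size)
    e0[0] = 1.0
    g = np.linalg.solve(matrix, e0)
    return g / g[0], (1.0 / g[0]).real

def density_estimate(C, kappa, x):
    """kappa / |sum_k C_k exp(-2 pi i k x)|^2 at fractional coordinates x."""
    k = np.arange(len(C))
    poly = np.exp(-2j * np.pi * np.outer(x, k)) @ C
    return kappa / np.abs(poly) ** 2

def fourier_synthesis(F, x):
    """Truncated series F_0 + 2 Re sum_{h>=1} F_h exp(-2 pi i h x)."""
    h = np.arange(1, len(F))
    series = np.exp(-2j * np.pi * np.outer(x, h)) @ F[1:]
    return F[0].real + 2.0 * series.real

r, x0 = 0.8, 0.3
h = np.arange(4)
F = r ** h * np.exp(2j * np.pi * h * x0)            # F_0 .. F_3 only

C, kappa = solve_filter(F)
x = np.linspace(0.0, 1.0, 200, endpoint=False)
estimate = density_estimate(C, kappa, x)
truncated = fourier_synthesis(F, x)
exact = (1 - r ** 2) / (1 - 2 * r * np.cos(2 * np.pi * (x - x0)) + r ** 2)
```

In this idealized case the estimate built from four coefficients coincides with the exact density and is positive everywhere, whereas the truncated Fourier synthesis of the same coefficients shows negative ripples.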

There is a variety of ways to see that cρ is a high-resolution
estimator of density. The most cogent evidence is given in Robinson’s
theorem. This theorem requires that C be the most compact
representation of ρ in the sense that for any limit of coefficient
order, C will represent ρ as well or better than any other set of
coefficients, even F.

Perspective on this remarkable fact is provided by practical
consideration of the solution of (24). It is known that the
Karle-Hauptman (1950) determinants approach singularity in the
neighbourhood of order N, the number of atoms in a unit cell (Goedkoop,
1950), and therefore the number of non-trivial elements in C is limited
to being not much larger than N. Clearly, if the elements of a set of
structure factors are exceedingly more numerous than the atoms in a
unit cell, then normal Fourier synthesis will yield an authoritative
representation of electron density which cannot be matched by cρ. If,
on the other hand, the number of known structure factors is less than
N, cρ is a density estimate at a resolution higher than that provided
by Fourier synthesis of structure factors. In connection with
information theory this phenomenon is called superresolution (Benjamin,
1980).

These two extremes are not representative of the largest part of
crystallographic problems either in the somewhat abstract
one-dimensional case of this section or in three dimensions. A
desirable but so far unknown analysis for a representative real case
would combine quantitatively the effects of additional or imposed
information and experimental error to give an estimate of potential
resolution enhancement in cρ. In the absence of such analysis the
nature of these effects is nevertheless clear. The two outstanding
facts of principle are that resolution in cρ is without limit, and
experimental error in any particular F is distributed over all elements
of C in some manner consistent with positive-definite density. Both
facts, it should be noted, follow directly from the functional
constraints required for discovery of unique, analytic 1/g free of
zeros and poles on and within the unit circle as required by wk
prediction theory.

While a formalism for resolution enhancement has
been presented, its measure has been achieved only
in particular cases through study by simulation. A
one-dimensional crystallographic illustration discussed in
detail elsewhere (Collins, 1978) here provides an
empirical basis for expectation of resolution enhancement
by factors > 2 in favourable cases. The artificial
structure described in Table 1 was constructed to
contain atoms separated by typical interatomic
distances. The computations are all based on error-free
structure factors calculated for atoms free of
thermal motion. Solution of equation (24) at order
n = 14 gave the elements of C tabulated in Table 2.
The corresponding density functions, ᶜρ and ρ based
upon Fourier synthesis of structure factors with
indices in the range 0-14, are given in Fig. 1. In
the figure ᶜρ is plotted above the zero line and
increasing upward; ρ is plotted below the line and increasing
downward. It is clear from the formulation of ᶜρ
that it is always positive, but ρ, having negative values,
is actually plotted as its absolute value with negative
areas shaded. The positions of the 13 atoms are
marked by vertical hash marks across the zero line.

Table 1. Atomic position parameters and interatomic
distances for a hypothetical one-dimensional structure

Atom   x         Interatomic     Atom   x         Interatomic
                 distance, Å                      distance, Å
C1     0.01094    1.5            N8     0.45312    1.3
O2     0.03438    1.2            C9     0.47344   15.1
C3     0.05312    1.4            O10    0.70938    1.3
N4     0.07500    1.4            C11    0.72969    3.8
C5     0.09688   10.1            N12    0.78906    1.5
O6     0.25469   11.3            C13    0.81250   12.7
C7     0.43125    1.4            C1     1.01094
N8     0.45312

Axis length = 64.0 Å

Table 2. The extrapolation-error filter derived
from data at 4.6 Å resolution.* C_h = A_h + iB_h.

 h     A_h       B_h
 0     1.000     0.0
 1    −0.416     0.146
 2     0.068    −0.190
 3     0.132    −1.417
 4    −0.975     0.449
 5     0.242    −1.155
 6    −0.598     0.642
 7    −0.528     1.536
 8    −0.691    −0.447
 9     0.624     0.686
10     0.261     0.168
11     0.558    −0.253
12     0.485     0.155
13    −0.246     0.071
14    −0.087    −0.457

*For this filter, the error-series variance is σ²₁₄ = 2.94.

Fig. 1. One-dimensional electron-density functions based on
data at 4.6 Å resolution.

It is clear that the resolution in ᶜρ is greater than in
ρ. It is to be noted that the two 1.3 Å separations have
been resolved but a 1.2 Å separation has not. The
rule of thumb which requires data at a resolution
(minimum interplanar spacing) just equal to the
distance between atoms to be resolved, in this case
implies the effective resolution in ᶜρ is in the range
1.2 to 1.3 Å. The structure-factor data employed are at
a resolution of 4.6 Å (minimum interplanar spacing)
and this simulation corresponds to qualitative resolution
enhancement by a factor somewhat greater than
3. Resolution of the 1.2 Å separation is not achieved
until structure factors with indices up to 32 are used
(Collins, 1978). The corresponding data resolution is
2.0 Å (minimum interplanar spacing) and the qualitative
resolution enhancement is thus reduced to somewhat
under 2. In this latter case the order of equation
(24) is 32 and is significantly greater than N = 13.
The corresponding results should not be considered
especially reliable in view of the ill-conditioning
problem discussed earlier. In consideration of these
observations and the known non-linearity of the relationship
between superresolution and the extent of parent
data (Benjamin, 1980), it is reasonable to make the
following empirical assertion. Wiener methods may
be employed in crystallographic problems to enhance
resolution in density functions by a factor > 2 under
favourable conditions.
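
The figures quoted in this illustration can be checked by simple
arithmetic. The following Python sketch assumes the fractional
coordinates of Table 1 and an axis length of 64.0 Å, and takes the
data resolution in one dimension to be the axis length divided by
the highest index used. The variable and function names are
illustrative only and do not come from the original work.

```python
# Fractional coordinates x of the 13 atoms of Table 1 (axis length 64.0 Å).
AXIS = 64.0
x = [0.01094, 0.03438, 0.05312, 0.07500, 0.09688, 0.25469, 0.43125,
     0.45312, 0.47344, 0.70938, 0.72969, 0.78906, 0.81250]

# Close the chain with the first atom translated by one lattice period.
chain = x + [x[0] + 1.0]
distances = [round((b - a) * AXIS, 1) for a, b in zip(chain, chain[1:])]


def data_resolution(h_max, axis=AXIS):
    """Minimum interplanar spacing (Å) for indices 0..h_max in one dimension."""
    return axis / h_max


res_14 = data_resolution(14)   # data used at order n = 14
res_32 = data_resolution(32)   # data used at order n = 32

# Qualitative enhancement: data resolution over smallest separation resolved.
enh_14 = round(res_14, 1) / 1.3   # the 1.3 Å separations are resolved
enh_32 = res_32 / 1.2             # the 1.2 Å separation is resolved

print(distances)
print(round(res_14, 1), res_32, round(enh_14, 1), round(enh_32, 1))
```

The computed separations reproduce the interatomic distances listed
in Table 1, and the two ratios correspond to the enhancement factors
described as somewhat greater than 3 and somewhat under 2.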

The structure factors used in the illustration are
error-free. The error-series variance is nevertheless
not zero but gives a measure of the degree to which
C does not represent the information contained in F.
Because C is unitless, σ_n has the units of F and gives
as a number of electrons the global mismatch between
ρ, the transform of experimental F, and ᶜρ. For n = 1
there is no structure in ᶜρ and its mismatch with ρ is
exactly given by the total number of electrons per
lattice point. In the real case for which structure
factors may be taken as the sum of true values and
(small) random numbers, experimental ρ will also be
the sum of true values and (small) random numbers.
The necessarily zero-mean random process cannot
be represented by positive-definite ᶜρ and although the
errors in F may be distributed unpredictably over ᶜρ
the value for σ² will certainly increase as ordinary
experimental error is carried into the values for F.


4. Helson, Lowdenslager and n-Dimensions

A. Principles. In the preceding section discussion
was limited mainly to one dimension. The transition to
two or more dimensions is not straightforward. This
section and the n-dimension generalization of Wiener
methods depend heavily upon the work of Helson and
Lowdenslager (1958). In that paper the authors point
to the underlying problem of n-dimensional generalization
in the statement that ‘analytic function theory
divides into two distinct disciplines in higher
dimensions’. Here reference is made to the theory of functions
on the unit bicylinder as contrasted to the unit circle.
The generalization problem has an obvious practical
manifestation in the loss of Toeplitz-form matrices
in the two-dimensional form of equation (24). For
present purposes the loss of Toeplitz form is the sole
problem of generalization from one dimension and
the two-dimensional case therefore suffices as a
representative for any number of dimensions. For
the following discussion many findings and statements
have been taken without proof from Helson and
Lowdenslager (HL, hereafter) whose presentation
provides a full discussion of each point.

The foundational construction which corresponds
to the causality or one-sidedness of one dimension is
the half-lattice of two (or more) dimensions proposed
by HL. On a two-dimensional primitive net whose
points are represented by integral coordinates (m, n),
S is a half-lattice if

a. (0, 0) does not exist in S,

b. (m, n) exists in S if and only if (−m, −n) does
not exist in S unless m = n = 0,

c. (m, n) exists in S and (m′, n′) exists in S imply
(m + m′, n + n′) exists in S.

It may be noted that this definition of a half-lattice
is suitable for any number of dimensions including
one.
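
The three conditions can be tested mechanically on a finite window
of the net. The Python sketch below is an illustration only: the
function names, the window radius and the two example sets are
assumptions made here, not constructions taken from HL. The first
set is the familiar half-plane ordered first on m and then on n;
the second, the open positive quadrant, fails condition b.

```python
from itertools import product


def in_half_plane(m, n):
    """Example set: m > 0, or m = 0 and n > 0."""
    return m > 0 or (m == 0 and n > 0)


def in_quadrant(m, n):
    """Counter-example: the open positive quadrant."""
    return m > 0 and n > 0


def is_half_lattice(member, radius=4):
    """Check conditions a, b and c for all points within a square window."""
    pts = list(product(range(-radius, radius + 1), repeat=2))
    # a. the origin is excluded
    if member(0, 0):
        return False
    # b. exactly one of (m, n) and (-m, -n) belongs to S
    for m, n in pts:
        if (m, n) != (0, 0) and member(m, n) == member(-m, -n):
            return False
    # c. S is closed under addition
    for (m, n), (p, q) in product(pts, repeat=2):
        if member(m, n) and member(p, q) and not member(m + p, n + q):
            return False
    return True


print(is_half_lattice(in_half_plane), is_half_lattice(in_quadrant))
```

A finite window can only refute the conditions, not prove them; for
the half-plane example they also follow directly by inspection.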

The foundational theorem for n-dimensional generalization
of ᶜρ makes use of the half-lattice as a domain
throughout which Fourier coefficients may be non-zero.
The theorem, a slightly weakened form of HL
theorem 2, is:

Let ρ be summable on the bicylinder and given by
the Fourier series

$$\rho(e^{ix}, e^{iy}) = \rho_{00} + \sum_{S} \rho_{mn} \exp\{-2\pi i(mx + ny)\} \tag{26}$$

where S is any half-lattice. Then

$$\iint \ln|\rho| \, dx \, dy \ge \ln|\rho_{00}|. \tag{27}$$

The point of immediate importance is that |ρ| > 0,
provided the half-lattice condition is observed in (26),
and |ρ₀₀| > 0. If |ρ| > 0 and is given by (26), then ρ⁻¹
is summable on the unit bicylinder and both it and
its absolute value are well behaved and non-zero. A
formal two-dimensional electron-density estimator
analogous to ᶜρ_x may be written as

$${}^{c}\rho_{xy} = \frac{\sigma^{2}/A}{\left| C_{00} + \sum_{S} C_{mn} \exp\{-2\pi i(mx + ny)\} \right|^{2}} \tag{28}$$

where A is the area of the lattice period and, as for the
one-dimensional case, C₀₀ = 1 which here also ensures
the finiteness of ᶜρ_xy by HL theorem 2.

Relationships necessary to evaluate (28) are
readily developed along the lines used earlier in the
one-dimensional case. As before, the key physical
assertions are the existence and availability of structure
factors and the positivity of electron density expressed
by existence of non-zero g such that ρ = |g|² > 0.
HL show that if ρ (real) is non-negative and summable
on the bicylinder, then for any definite half-lattice
there is a unique H given by

$$H = \sum_{S} C_{mn} \exp\{-2\pi i(mx + ny)\} \tag{29}$$

such that

$$\rho = \left( |1 + H| / k \right)^{-2} \tag{30}$$

where k is a positive constant. With c = 1 + H it is
evidently the case that

$$|g| = \frac{k}{|c|} \tag{31}$$

and

$$g\,c = k \exp\{i\phi\}, \tag{32}$$


where φ is undetermined and c has the form given in
(26) with C₀₀ = 1.0. The Fourier transform of (32) may
be written in matrix notation as

$$G\,C = \epsilon \tag{33}$$

and premultiplication by A⁻¹G† gives

$$\frac{1}{A}\, G^{\dagger} G\, C = F\, C = \beta \tag{34}$$

because F is the autocorrelation of G as required by
(9). It is interesting that if equation (33) is satisfied and
F is not singular, then

$$C^{\dagger} F\, C = \sigma^{2}, \tag{35}$$

for any suitably chosen elements for β, not all zero,
as required by (32) and the consequent random nature
of ε. Analogy with the one-dimensional case suggests
that except for the lead element, all elements of β be
taken as zero. It is easily verified that after multiplicative
adjustment to make C₀₀ = 1 the lead element
of β is σ², the variance of ε. Complete determination
of C, subject to specification of a half-lattice, may be
obtained as the solution of

$$\begin{bmatrix} F_{00} & F_{01} & \cdots & F_{0n} \\ F_{10} & F_{11} & \cdots & F_{1n} \\ \vdots & \vdots & \ddots & \vdots \\ F_{n0} & F_{n1} & \cdots & F_{nn} \end{bmatrix} \begin{bmatrix} 1 \\ C_{1} \\ \vdots \\ C_{n} \end{bmatrix} = \begin{bmatrix} \sigma^{2} \\ 0 \\ \vdots \\ 0 \end{bmatrix} \tag{36}$$

in which the subscripts h_i and h_i − h_j have been
replaced by i and ij respectively.
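
A system of the form (36) may be solved without knowing σ² in
advance: F y = (1, 0, ..., 0)ᵀ is solved first, and y is then
rescaled so that its lead element is 1. The Python sketch below
illustrates this under the assumption that F is Hermitian and
positive definite. The function name and the small test matrix are
hypothetical and are not data from the text.

```python
import numpy as np


def solve_filter(F):
    """Solve F C = (sigma2, 0, ..., 0)^T with the lead element of C fixed at 1."""
    F = np.asarray(F, dtype=complex)
    e0 = np.zeros(F.shape[0], dtype=complex)
    e0[0] = 1.0
    y = np.linalg.solve(F, e0)      # y = C / sigma2
    sigma2 = 1.0 / y[0].real        # real and positive for positive-definite F
    C = y * sigma2                  # rescale so that C[0] = 1
    return C, sigma2


# Illustrative 3 x 3 Hermitian, positive-definite matrix (not from the text).
a, b = 0.4 + 0.2j, 0.1 - 0.1j
F = np.array([[1.0, a, b],
              [np.conj(a), 1.0, a],
              [np.conj(b), np.conj(a), 1.0]])

C, sigma2 = solve_filter(F)
print(C, sigma2)
print(np.vdot(C, F @ C))            # quadratic form of equation (35)
```

Because the right-hand side has a single non-zero element and the
lead element of C is 1, the quadratic form C†FC reduces to σ², in
agreement with (35).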

Practical selection of a half-lattice by its definition
is a little awkward. For crystallographic application
the problem is greatly simplified by construction of the
lattice starting with some reciprocal lattice point h₀.
All h, |h| < |h₀| may then be added to h₀ to generate
a half-lattice S′. This lattice is somewhat more restricted
than HL require, but it satisfies their geometry
and in any case may be allowed to increase without
bound by allowing h₀ to become arbitrarily large.

The great advantage of the construction centred
on h₀ arises in the following way. Consider that for
some h₀ the set (h′_i = h_i + h₀) may be listed in order of
increasing magnitude of h_i and that the corresponding
elements of C may be given in the same order, and
C₀₀ last. Now F_ij, the elements of F, are independent of
h₀ because h′_i − h′_j = h_i − h_j. Furthermore, in the
expression

$$\left| \sum_{T'} C_{h'} \exp\{-2\pi i\, h' \cdot x\} \right|^{-2}, \tag{37}$$

where T′ signifies the half-lattice S′ and its origin, all
h′ may be changed by an arbitrary fixed vector, say
−h₀, without changing the numerical value of the
expression. Because F is unaffected by h₀ as is the value
of (37), equation (36) may be set up with h₀ = (0, 0)
then h_i, |h_i| ≥ |h_{i−1}|; i = 1, 2, ..., n. It does not
appear that omission of any h need be inimical to
evaluation of ᶜρ but it is not clear that there is an
advantage to any such omission and it is not considered
further.

B. Applications. Crystallographic application
involving n-dimensional density estimation requires
evaluation of equation (36). Density estimation then
follows after evaluating

$${}^{c}\rho_{x} = \frac{\sigma^{2}/A}{\left| \sum_{T} C_{h} \exp\{-2\pi i\, h \cdot x\} \right|^{2}}. \tag{38}$$

In this equation, which is a modified form of equation
(28), T represents a lattice portion including the origin
and points well distributed about the origin, |h| ≤ |t|/2.
While the maximum of |h| is |t|/2, the maximum of
|h_i − h_j| is |t|; |t|/2 is designated ‘estimator resolution’
when expressed as a minimum interplanar spacing;
similarly |t| is ‘information resolution’. It may be
noted that in the one-dimensional case the true half-lattice
is retained for applications and estimator
resolution is the same as information resolution.

A choice of n, the number of elements in T excluding
the origin, must be limited to that for which the
implied information resolution lies within the range
of available structure-factor data. It may be that the
limit by data is not absolute, but it represents the
actual limit of information employed whether or not
n is allowed to increase further. The number of atoms
in a unit cell is, as in one dimension, an upper limit
for n insofar as (36) tends toward ill-conditioning as
n increases beyond N.
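
The procedure of this subsection, selection of T, assembly of F
from differences of indices, solution of (36) and evaluation of
(38), may be sketched for a toy two-dimensional case. Everything
specific in the Python example below is an assumption made for
illustration: two equal Gaussian atoms, the damping constant b, a
unit-area square cell, a disc of radius 2 for T, and the sign
convention F(h) = Σ exp{2πi h·x}. It is not a reconstruction of the
program used for the results reported here.

```python
import numpy as np
from itertools import product


def lattice_disc(radius):
    """Integer points h with |h| <= radius, origin first, sorted by |h|."""
    pts = [h for h in product(range(-radius, radius + 1), repeat=2)
           if h[0] ** 2 + h[1] ** 2 <= radius ** 2]
    pts.sort(key=lambda h: h[0] ** 2 + h[1] ** 2)
    return np.array(pts)


def structure_factor(h, xy, b=0.15):
    """Toy F(h) for equal Gaussian atoms at fractional positions xy."""
    h = np.atleast_2d(h)
    phase = np.exp(2j * np.pi * (h @ xy.T))        # shape (n_h, n_atoms)
    damp = np.exp(-b * np.sum(h ** 2, axis=1))     # overall fall-off
    return damp * phase.sum(axis=1)


def estimate_density(T, xy, grid=32, area=1.0):
    """Solve (36) on T and evaluate an estimator of the form (38) on a grid."""
    diff = T[:, None, :] - T[None, :, :]           # h_i - h_j
    F = structure_factor(diff.reshape(-1, 2), xy).reshape(len(T), len(T))
    e0 = np.zeros(len(T))
    e0[0] = 1.0
    y = np.linalg.solve(F, e0)
    sigma2 = 1.0 / y[0].real
    C = y * sigma2                                 # lead element equals 1
    g = np.arange(grid) / grid
    X = np.stack(np.meshgrid(g, g, indexing="ij"), axis=-1).reshape(-1, 2)
    denom = np.abs(np.exp(-2j * np.pi * (X @ T.T)) @ C) ** 2
    return ((sigma2 / area) / denom).reshape(grid, grid), C, sigma2


T = lattice_disc(2)
xy = np.array([[0.2, 0.3], [0.7, 0.6]])
rho, C, sigma2 = estimate_density(T, xy)

h_max = np.max(np.linalg.norm(T, axis=1))                       # |t|/2
d_max = np.max(np.linalg.norm(T[:, None] - T[None, :], axis=2)) # |t|
print(len(T) - 1, h_max, d_max, sigma2)
```

For this choice of T the largest difference of indices is twice the
largest index, which is the relation between information resolution
and estimator resolution stated above.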

Simulation studies were conducted using the
two-dimensional centrosymmetric projection of
hexamethylbenzene reported by Brockway and Robertson
(1939). After minor adjustment of the reported
coordinates and determination of an overall isotropic
thermal parameter to be B = 3.0 Å², structure factors
corresponding to the observations were calculated.
These calculated structure factors were used as error-free
data for the simulation studies except as noted.
For the structure model R = 0.21.

Straightforward evaluation of ᶜρ by equations
(36) and one equivalent to (38) was carried out for a
variety of values for n. Although the atoms, all
carbon, were resolved for a smallest value of n = 52,
the qualitatively similar map for n = 54 was subjectively
judged to be a significantly better representation
of the structure. Fig. 2 gives ᶜρ for n = 54.
Fig. 3 gives ρ on the same scale as computed by
normal Fourier synthesis of the 108 unique (perfect)
structure factors corresponding to the complete set
of reported observations. For n = 54, σ²₅₄ = 37 and the
estimator resolution is 1.62 Å or sin θ/λ = 0.31 Å⁻¹;
the corresponding information resolution is 0.81 Å
or sin θ/λ = 0.62 Å⁻¹.

Fig. 2. Hexamethylbenzene projection given by an order 54
electron-density estimator.

Fig. 3. Hexamethylbenzene projection at high resolution given
by Fourier synthesis of error-free structure factors.

The resolution of the original set of data is 0.78 Å
or sin θ/λ = 0.64 Å⁻¹. Clearly, ᶜρ as calculated for
n = 54 does not represent a more efficient use of
information than does normal Fourier synthesis.
Figs. 2 and 3 show that ᶜρ is naturally more spiky than
a corresponding Fourier synthesis of the structure-factor
data upon which ᶜρ is based. This parallels
the common crystallographic practice of ‘sharpening’
density functions, whether in electron-density or
Patterson maps, and suggests that unmodified structure
factors may not be the most suitable for resolution
enhancement. Sharpening of any kind is achieved by
alteration of structure factors to remove some of or
all their global sin θ fall-off. Use of altered structure
factors in the matrix F expands the range of its
eigenvalues so the smallest become smaller and the
determinant of F tends toward zero. This effect
provides the basis for the following test.

Calculations designed to force maximum resolution
enhancement in ᶜρ used calculated phase and reported
experimental structure-factor moduli (over)corrected
for thermal motion by exp{B sin²θ/λ²} in which
B = 6.0. For various values of n, a cycle of solutions
of equation (36) was devised which would drive |F|
toward singularity. The cycle comprised solution of
(36) for a selected value of n, complete eigen-analysis
of F and subsequent reduction of all diagonal elements
by 3/4 of the smallest eigenvalue before a new
solution of (36). From a series for n = 8, 12, 16, 20,
24, ᶜρ for n = 20 and F₀₀ reduced to 69 was judged
to be the function which displayed maximum effective
resolution enhancement. Each of the six independent
atoms appeared in ᶜρ at the right location; three were
given by single peaks, three by double peaks and there
were no other significant peaks in the map. The map
is excessively spiky with its highest peak at 77 e/Å²,
its lowest at 20 e/Å² and about 80% of the map
below 1 e/Å². In this case σ²₂₀ = 16 and the ratio
σ²/F₀₀ = 0.23, a substantial improvement over the
earlier case for n = 54 in which σ²/F₀₀ = 0.51.
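
The cycle described above may be sketched as follows. The Python
code is a minimal illustration under stated assumptions: F is taken
as a small Hermitian, positive-definite test matrix rather than a
matrix of hexamethylbenzene structure factors, and the function
names and the number of cycles are hypothetical. Each pass solves
(36), performs the eigen-analysis and lowers every diagonal element
by 3/4 of the smallest eigenvalue.

```python
import numpy as np


def solve_filter(F):
    """Solve F C = (sigma2, 0, ..., 0)^T with the lead element of C equal to 1."""
    e0 = np.zeros(F.shape[0], dtype=complex)
    e0[0] = 1.0
    y = np.linalg.solve(F, e0)
    sigma2 = 1.0 / y[0].real
    return y * sigma2, sigma2


def singularity_cycle(F, n_cycles=3, fraction=0.75):
    """Repeatedly solve (36) and reduce the diagonal of F toward singularity."""
    F = np.array(F, dtype=complex)
    history = []
    for _ in range(n_cycles):
        C, sigma2 = solve_filter(F)
        lam_min = np.linalg.eigvalsh(F)[0]     # smallest eigenvalue of F
        history.append((sigma2, lam_min))
        # Lowering the diagonal by a constant shifts every eigenvalue equally.
        F = F - fraction * lam_min * np.eye(F.shape[0])
    return F, history


a, b = 0.4 + 0.2j, 0.1 - 0.1j
F0 = np.array([[1.0, a, b],
               [np.conj(a), 1.0, a],
               [np.conj(b), np.conj(a), 1.0]])

F_final, history = singularity_cycle(F0)
for sigma2, lam_min in history:
    print(round(sigma2, 4), round(lam_min, 4))
```

Since a uniform reduction of the diagonal shifts all eigenvalues by
the same amount, the smallest eigenvalue falls to one quarter of its
previous value on each pass while F remains positive definite, and
σ² decreases with it.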

Resolution of the atoms in the projection of
hexamethylbenzene undoubtedly could be forced for
lower values of n had the data been more exact or the
calculations more tightly constrained. For n = 20 the
estimator resolution is 2.75 Å, the information
resolution is 1.37 Å and the smallest interatomic
separation resolved is a foreshortened 1.06 Å.
Concerning qualitative location of equal atoms, the
rule of thumb mentioned in discussion of the
one-dimensional case leads to an expectation of potential
n-dimensional resolution enhancement of at least
1.37 Å/1.06 Å = 1.3 provided the structure-factor
moduli are in error by less than 20%.
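
The resolution figures quoted in this subsection are related by
Bragg's law, sin θ/λ = 1/(2d), and may be verified by the short
Python calculation below. The function and variable names are
illustrative only.

```python
def s_from_d(d):
    """sin(theta)/lambda in 1/Å for a minimum interplanar spacing d in Å."""
    return 1.0 / (2.0 * d)


spacings = {
    "n = 54, estimator resolution": 1.62,
    "n = 54, information resolution": 0.81,
    "original data": 0.78,
}
s_values = {label: round(s_from_d(d), 2) for label, d in spacings.items()}

# n = 20: information resolution over smallest separation resolved.
enhancement_2d = round(1.37 / 1.06, 1)

# n = 20: ratio of error-series variance to the reduced F00.
ratio_n20 = round(16 / 69, 2)

print(s_values, enhancement_2d, ratio_n20)
```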


5.  Connections  with  information  theory 

Resolution enhancement as a specific goal in the
development and application of Wiener methods for
electron density has a clear conceptual parallel with
Shannon & Weaver’s (1949) information theory.
As presented by Shannon, information theory was
concerned with the efficiency of encoding and
transmitting information. The crystallographic problem of
resolution enhancement is similar but has a different
emphasis in that it is concerned with extracting a
maximum of information from an existing set of
structure factors. The underlying unity of Wiener
methods and information theory is readily developed
along two distinct lines.

A. Filters. Shannon’s Theorem 14 (Shannon &
Weaver, 1949) requires that a signal passed through
a linear filter with Fourier transform X(f) undergo a
gain in entropy proportional to

$$\int_{W} \ln |X(f)|^{2} \, df \tag{39}$$

where W indicates the frequency band to which components
of X(f) are limited. If an uncorrelated or
white signal is passed through the filter then its
entropy gain is given by (39), within a constant. A
maximum of entropy gain may be sought for the
filtering process, subject to suitable constraints, and
the resulting |X(f)|² is the maximum entropy spectral
estimate for the time series x (Ables, 1974; Robinson
& Treitel, 1980). The constraints to be imposed are
in the form of known φ, the autocorrelation of x,

$$\int_{W} \Phi(f) \exp\{-2\pi i f k\} \, df = \phi_{k}; \quad -m \le k \le m, \tag{40}$$

and the expression to be maximized is

$$\int_{W} \left\{ \ln \Phi(f) - \sum_{k=-m}^{m} \lambda_{k} \left[ \Phi(f) \exp\{-2\pi i f k\} - \phi_{k} \right] \right\} df \tag{41}$$

in which Lagrange multipliers λ have been introduced.

Maximization of (41) results in

$$\Phi(f) = \frac{1}{\sum_{k=-m}^{m} \lambda_{k} \exp\{-2\pi i f k\}}. \tag{42}$$

It is clear that Φ(f) has been required to be positive
definite and that this equation, which gives a form for
the maximum-entropy estimate of Φ(f) or |X(f)|²,
therefore may be written as

$$\Phi(f) = \frac{\sigma^{2}}{\left| \sum_{t=0}^{m} \gamma_{t} \exp\{-2\pi i t f\} \right|^{2}}. \tag{43}$$

Of course the foregoing right-hand expression has
been arranged to correspond to equation (20) and to
show that ᶜρ_x in (25) is a maximum-entropy electron-density
estimate. In regard to ᶜρ_x, constraints given
by (40) are the known structure factors.
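
The sense in which an estimate of this all-pole form honours the
constraints (40) can be demonstrated numerically. In the Python
sketch below the band W is assumed to be the unit interval, the
autocorrelation values are invented for the test, and the function
name is hypothetical. The coefficients γ are obtained from a
Hermitian Toeplitz system of the same kind as (36), and the
autocorrelation of the resulting spectrum is recovered by numerical
integration.

```python
import numpy as np


def max_entropy_spectrum(phi, n_grid=2048):
    """All-pole spectral estimate from autocorrelation values phi[0..m]."""
    phi = np.asarray(phi, dtype=complex)
    m = len(phi) - 1
    idx = np.arange(m + 1)
    lag = idx[None, :] - idx[:, None]              # lag[s, t] = t - s
    # Hermitian Toeplitz matrix with elements phi_{t-s}; phi_{-k} = conj(phi_k).
    T = np.where(lag >= 0, phi[np.abs(lag)], np.conj(phi[np.abs(lag)]))
    e0 = np.zeros(m + 1)
    e0[0] = 1.0
    y = np.linalg.solve(T, e0)
    sigma2 = 1.0 / y[0].real
    gamma = y * sigma2                             # gamma[0] = 1
    f = np.arange(n_grid) / n_grid                 # W taken as [0, 1)
    filt = np.exp(-2j * np.pi * np.outer(f, idx)) @ gamma
    return f, sigma2 / np.abs(filt) ** 2, gamma, sigma2


phi = np.array([1.0, 0.4 + 0.2j, 0.1 - 0.1j])      # illustrative values only
f, Phi, gamma, sigma2 = max_entropy_spectrum(phi)

# Recover phi_k = integral of Phi(f) exp{-2 pi i f k} over the band.
k = np.arange(len(phi))
recovered = (Phi[None, :] * np.exp(-2j * np.pi * np.outer(k, f))).mean(axis=1)
print(np.round(recovered, 6))
```

The recovered values agree with the supplied autocorrelation, which
is the property that allows the known structure factors to serve as
the constraints on ᶜρ_x.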

B. Density maps. Gull and Daniell (1978) have
proposed a maximum-entropy electron-density estimate
based upon constrained maximization of

$$-\sum_{x} \rho_{x} \ln \rho_{x}, \tag{44}$$

the configuration entropy of the density function. As
in the preceding case the constraints are the known
structure factors. Gull and Daniell’s formula for ρ is

$$\rho_{x} = \exp\left[ -1 + \lambda \sum_{K} \frac{F_{k}^{(o)} - F_{k}^{(c)}}{\sigma_{k}^{2}} \exp\{-2\pi i k x\} \right] \tag{45}$$

in which λ is a positive constant, K represents the set
of observations, F^(o) the observed structure-factor
data with presumed error-free phases, σ_k the standard
error for F^(o), and F^(c) the Fourier transform of the
most recent iterate of ρ_x.

The formulations of density based upon maximizing
of either Σ ln ρ or −Σ ρ ln ρ are obviously different. It
should be observed, however, that in both cases the
same structure-factor data would be used as constraints
and in both cases ρ is required to be positive
definite. This latter requirement, in principle, demands
a unique positive-definite estimate of ρ for fixed
structure moduli and associated phases. In principle,
use of either Σ ln ρ or −Σ ρ ln ρ ought to yield
identical results except for the numerical effects
of the manner in which structure factors enter the
computations.
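
A small numerical comparison shows the behaviour the two functionals
share. The Python sketch below assumes only that the total content
of the map is fixed, which is a much weaker constraint than a set of
structure factors; the two sample maps and the function names are
invented for illustration. Under that single constraint both
functionals are larger for the flat map than for the peaked one.

```python
import numpy as np


def sum_log(rho):
    """The functional sum of ln(rho) over the map."""
    return float(np.sum(np.log(rho)))


def config_entropy(rho):
    """The configuration entropy, minus the sum of rho ln(rho)."""
    return float(-np.sum(rho * np.log(rho)))


flat = np.full(8, 1.0)
peaked = np.array([4.0, 2.0, 1.0, 0.5, 0.2, 0.1, 0.1, 0.1])  # same total as flat

for name, rho in (("flat", flat), ("peaked", peaked)):
    print(name, round(sum_log(rho), 3), round(config_entropy(rho), 3))
```

Both functionals require strictly positive ρ, since the logarithm is
undefined otherwise; this is the positivity requirement referred to
above.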

The  differences  between  (25)  and  (45)  are  most 
substantial.  Nevertheless,  two  particular  similarities 
are  important  to  crystallographic  application.  Both 
formulations  for  p  provide  a  means  of  distributing 
error  in  a  structure  factor  over  the  entire  set,  and  both 
can yield true superresolution or resolution enhancement
in a maximum-entropy electron-density estimate.


6.  Conclusions 


Wiener  methods  bring  analytic-function  theory  to 
bear  on  computations  involving  electron  density. 
Application  of  the  methods  turns  upon  the  ideal 
positive-definite  character  of  density  functions  and 
involves  the  determination  of  their  analytic  square 
roots.  It  is  assumed  that  structure  factors  are  available 
and  free  of  such  systematic  errors  as  anomalous 
dispersion.  Determination  of  an  analytic  root  of 
electron  density  can  also  be  considered  deconvolution 
of  structure  factors  and  although  Wiener  methods 
require  both,  the  operations  need  not  be  explicit. 

Actual calculations emphasize the analytic-root
aspect of the methods inasmuch as ᶜρ is given as
proportional to the squared modulus of an analytic
function. Resolution enhancement, the initial goal of
this work, is more readily related to the deconvolution
operation in reciprocal space. Inasmuch as a root of
ᶜρ has a transform not limited in resolution by the
experimental range of structure-factor data, the
implied convolution whose transform is ᶜρ is also
potentially of unlimited extent in reciprocal space.
Because structure-factor deconvolution is only implied,
resolution-enhancement factors have been estimated
by comparison of density functions. It seems clear
that the Wiener methods are most successful in one
dimension for which resolution-enhancement factors
of > 2 may be expected for the qualitative resolution
of individual atom profiles in an electron-density
function.

In two dimensions, which illustrates n-dimensional
generalization, resolution-enhancement factors up to
1.3 were found by empirical testing. The highest
resolution factors were found, however, by forcing a
matrix of structure factors toward singularity.
Although the results clearly show resolution enhancement,
the techniques used to maximize resolution are
not suitable to general use. In n dimensions it appears
that inherently iterative schemes will be required for
routine exploitation of resolution enhancement by
Wiener methods.

The resolution enhancement of Wiener methods is
the same phenomenon as the superresolution which
accompanies formation of maximum-entropy images
by information-theory methods. This is required by
the form of ᶜρ, which is exactly that for a maximum-entropy
estimate of ρ, ρ treated as the squared modulus
of the transform of a linear filter.

This work has been supported in part by the

SUBJECT INDEX


Note. Certain frequently occurring expressions, mainly
statistical, with or without qualifying adjectives, are not exhaustively
indexed. These include Atomic scattering factor, Correlation,
Covariance, Deviation, Distribution, Mean, Parameter, Space
group, Statistics, Structure factor, Variance, and some others.

Absolute  configuration,  133,  134 
Absolute  intensities,  from  relative,  1-2 
Absorption,  as  source  of  bias,  234 
edge,  134 

Acentric  distribution,  5-8,  25,  73-74,  102,  123-125,  142 
variance  of,  8 
see also Space group P1
N-Acetyl-allo-hydroxy-L-proline lactone, 215
Alternatives  to  least  squares,  3,  4,  225-226,  229-262,  273-298 
269-272 

see  also  Least  squares 
Alternatives  to  R  tests,  19,  46,  48,  187-193 
Ammonium  hydrogen  malate,  203,  220-223 
Anomalous  dispersion,  Anomalous  scattering,  133-172 
see  also  Dispersion 
Argand  diagram,  137 
Arvesen’s  test,  187,  189-193 
Atomic  heterogeneity,  see  Heavy  atom 
Atomic  scattering  factor,  301  (note) 
affected  by  dispersion,  135-136 
Autocorrelation,  276  et  seq. 


Background,  important  in  determining  accuracy  of  intensity 
estimation,  33-37,  181-182 
Bayes’  theorem,  10,  22-23,  213 
Bayesian  statistics,  3,  10,  19-51 
Bessel  function,  modified,  12,  103,  123,  201,  213 
Bias,  in  density  estimation,  182,  226 
in  parameter  estimation,  4,  225,  230,  254,  256 
Bicentric  distribution,  127-128 
Bijvoet  differences,  Bijvoet  pairs,  29,  133-171 
Bijvoet  ratio,  defined,  138 
Biweight  function,  237,  240,  242,  245 

Cauchy  distribution,  109-110,  237-238 
contamination of normal, 190, 193
Central-limit theorem

independent variables, 2, 57-58, 60, 84, 102-103
non-independent variables, 176
Centric distribution, 8-9, 25, 73-74, 102, 119-123
variance of, 9
see also Space group P1̄
Centroid  method  of  locating  reflexion,  267 
contrasted  with  profile  fitting,  40-44 
Chi-square  agreement  statistic,  256 
Communication  engineering,  274 
Computer  programs,  INSTAT,  95 
mean  values  of  powers  of  structure  factors,  71-73 
MULTAN,  95 
NORMAL,  95 
RFINE  4,  245,  247,  248 

Computer  simulation,  of  error  distributions,  190-193 
of  intensity  distributions,  53,  55-56 
of  non-ideal  distributions,  58-66 
of  R2  distributions,  207-208 
Constraints,  refinement  under,  19,  44-49 
Convergence,  improvement  of,  248,  261 
Correlation  coefficient,  13-14,  301  (note) 
autocorrelation,  276  et  seq. 

Correlation  of  atomic  positions,  4,  175-177 
of  parameters,  230 

Counting statistics, 9-10, 24, 39, 301 (note)
as  recorded,  179-185 

assumed  to  have  a  normal  distribution,  26-27 
Covariance,  13,  188,  268,  301  (note) 

Crystallographic  statistics,  301  (note) 

Bayesian,  10-11,  19-51 

general  review,  5-17,  53-55,  83-92 

heavy-atom  dependence,  53-81,  105-110,  117-131 

non-ideal,  53-81,  83-97,  175-177 

origin  of,  1-2 

space-group  dependence,  53-96,  155-157 
Cumulants,  83,  86-92,  93-94 

Data  reduction,  267-268 

Deformation density, Deformation potential, 179
Dimers,  test  of  symmetry  of,  46 
cis,cis-4,6-Dimethyl trimethylenesulphite, 205
Direct  methods,  see  Structure  determination 
Discrepancy  index  (R  factor),  14,  19,  46,  48,  59,  60,  63, 195-223, 
232 

moments  of,  209-219 
Discriminator  criterion,  195-223 




Disordered  structures,  2 

Dispersion  (including  anomalous  dispersion,  scattering),  3,  16, 
29,  56,  69,  85,  133-172 


Edgeworth  expansion,  53,  56,  78-81,  93,  119-125 
and  non-independence,  175-177 
convergence  of,  75-80 

in  phase  determination,  99,  101,  104,  107,  110,  112-113 
Electron  density 
by  dispersion,  1 5 

by  Fourier  synthesis,  15,  179,  225-226,  265-266 
by  heavy-atom  method,  16 
maximum-entropy  estimate,  296-297 
reduction  of  error  in,  267-268 
Wiener  methods  for,  3,  273-298 
L-Ephedrine  hydrochloride,  170 
Equivalent  reflexions,  233  et  seq. 

Errors,  analysis  of,  267-268 
distribution  function  for,  190,  193,  226,  230 
of  model,  30-32,  259 
of  observation,  32,  39,  270 
random  (instrumental  instability),  40,  184-185 
statistical  fluctuation,  9-10,  15,  179-185,  265-266 
systematic,  15,  180,  244,  255-262 
Excess,  Kurtosis,  37,  215 
Expected  value,  27,  34,  38,  301  (note) 
see  also  Intensity  of  reflexion,  mean  value  of 
Extinction,  245-261  passim 
Extinction  parameter,  defined,  245 
Extrapolation,  295,  296 


Filters,  in  extrapolation  of  series,  279-287 
in  information  theory,  295-296 
Fixed-count  timing,  9 
Fixed-time  counting,  9 
Forsterite,  182-184 

Fourier  transforms,  as  basis  of  Wiener  methods,  274  et  seq. 
Frequentist  statistics,  contrasted  with  Bayesian,  19-20 
Friedel’s law, 134
see  also  Dispersion 


Gamma  distribution,  190 
function,  incomplete,  107 
Gaussian  distribution,  see  Normal  distribution 
Generalized distributions, 53-132
‘Globs’ (Harker), 176
Gram-Charlier expansions, 53-54, 56, 78-81, 93, 119-125
and  non-independence,  175-177 
convergence  of,  75-80 

in  phase  determination,  99-101,  103-104,  110 

Hamilton’s  R  test,  see  R  test 
Heavy  atom,  and  dispersion,  133 
effect  on  intensity  distributions,  55-66,  92,  105-110,  117-131 
Heavy-atom  analysis,  195 

Heavy-atom  method,  see  Structure  determination 
Hermite  polynomials,  68,  73-74,  75,  89,  120 
Hexamethylbenzene,  292-294 
Histograms, of R₂ distributions, 220
of  structure-factor  distributions,  60-62,  76 
stem-and-leaf  display,  257-259 
Hydrogen  parameters,  245-262  passim,  301  (note) 

Hypercentric  distribution,  126-128,  131 
Hypergeometric  function,  214 
Hypersymmetry,  92,  131 
Hypothesis  testing,  44-49,  187-193 

Information  theory,  294-297 
INSTAT,  95 

Instrument  instability,  39,  40,  41,  185 
Instrumental  variance,  268 
Insulin,  40-42

Intensity  of  reflexion,  301  (note) 
effect  of  correlation  of  atomic  positions  on,  4,  175-177 
mean  value  of,  2,  8,  24,  27,  176-177 
measurement  of,  33-44,  179-185
model  for,  33-36 

probability  distribution  of  large  intensities,  4 
of  small  and  moderate  intensities,  2-3,  5-131;  see  more 
specific  entries  (acentric,  bicentric,  centric,  etc.) 
symmetry  dependence  of,  54,  60-66,  70-77,  84-96,  117-131 
variance  of,  9,  27,  180-182 
Intensity  statistics,  301  (note) 
and  non-independence,  175-176 
of  recorded  counts,  179-185 
reviews,  5-17,  53-97 
Interpolation,  273,  274 
Isomorphism,  11-14,  15-16 

Jackknife  test,  187-193
Jacobian,  28 

Karle-Hauptman  determinants,  273,  283 
Kurtosis,  Excess,  37,  215 
Laguerre  polynomials,  68,  73-74,  75
Least-squares  refinement,  15,  27,  225-226,  269-272 
see  also  Alternatives  to  least  squares 
Likelihood,  23,  31,  225-226 
Lindeberg-Levy  central-limit  theorem,  57,  60 
L-Lysine  hydrochloride  dihydrate,  170 

Macromolecular  crystals,  150-151,  273 
see  also  Protein  crystallography 
MAD,  defined,  240 
Markov  chain,  114 

Matrix  equations,  solution  of,  281-282 
Maximum-entropy  estimates,  295-297 
Maximum  likelihood,  225-226 
Mean  intensity,  see  Intensity  of  reflexion 
Measurability  of  Bijvoet  differences,  133-172 
Measurement  of  intensity,  9-11,  33-44,  179-185
Median  absolute  deviation,  240-241 
Minimum-delay  operator,  282-283 
Model,  20,  301  (note) 

Bayesian  three-stage,  19,  29-33 
structural,  14,  46,  196 

Modified  Bessel  function,  12,  103,  123,  201,  213 
Monochromatization,  182 
MULTAN,  95 

Multiple-counter  techniques,  42-43 

‘Negative’  intensities,  10,  24-25,  182
Non-crystallographic  symmetry,  46,  126-128,  131 
neglect  of,  56 

partial  centrosymmetry,  133,  158-164 
Non-ideal  distributions,  53-132 
NORMAL  (program),  95 

Normal  distribution,  Gaussian  distribution,  99,  101-112,  265,
301  (note)

approximation  to  distribution  of  R2,  211,  215
in  significance  testing,  190-193 
inadequate  for  large  deviations,  226,  230,  254 
of  measurements  of  an  intensity,  26-27 
of  structure  factors  of  a  centrosymmetric  crystal,  8-9 
see  also  Centric  distribution 
Normalized  Bijvoet  difference,  defined,  137 
Normalized  structure  factor,  89,  100-101,  113,  138,  301  (note) 

Odd  moments,  vanishing  of,  88 
Ordinate  analysis  of  intensity  measurement, 
contrasted  with  profile  fitting,  40-43 
Orthogonal  polynomials,  67-68,  175-177
see  also  Hermite  polynomials,  Laguerre  polynomials 

Parameter  estimation,  15,  20,  21,  25,  31,  36,  301  (note) 
bias  in,  4,  225,  230,  254,  256 
Partial  symmetry,  96,  133,  139,  158-164 
Patterson  coefficient,  275 
Patterson  function,  15,  114,  164,  225-226 
Peak  position,  36,  267 
Peak  shape,  model  for,  34-36 
Peak-to-background  ratio,  180-182,  295 
Peak  width,  36,  267 
Phase  determination,  131 
errors  in,  16 

from  Bijvoet  differences,  133-172 
probability  of  validity,  99-114
Phase  shift  on  scattering,  134 
Pitman’s  test,  187,  190-193 
Point  groups,  72,  124-125 

Poisson  distribution,  approximated  by  normal,  39 
variance  of,  268 

Posterior  distribution,  10,  23,  27,  28-29  (figure)

Prealbumin,  43-45 

Prediction  of  stationary  series,  276  et  seq. 

Prior  distribution,  10,  22,  26,  28-29  (figure) 

Profile  fitting,  19,  33-44 

Protein  crystallography,  14,  29,  33,  40-46  passim,  58,  150-151, 
176,  273 

Pseudo-observations,  defined,  44 

Pseudosymmetry,  see  Non-crystallographic  symmetry,  Partial
symmetry
Pyrene,  127 

R  factor,  see  Discrepancy  index 

R  test,  19,  46,  48,  187-193 

R2,  moments  of,  209-219 

Rayleigh  distribution,  101,  110,  113-114 

Recorded  counts,  statistics  of,  10,  33-45,  179-195 

Residual,  see  Discrepancy  index 

Resistant,  see  Robust,  Robustness 

Resolution  enhancement,  273-298  passim,  specifically  286-287, 
294,  297-298 

Restrained  refinement,  44-46 
Robust/Resistant  techniques,  229-262 
Robustness,  193,  226,  229-262 
Rotation  search,  195,  219-223 
Rubidium  di-o-nitrobenzoate,  125 
Scale  factor,  225,  245,  246,  268
Scientific  method,  24,  47-48,  269-272 
Secondary  extinction,  245 
Secondary  minima,  269-272
see  also  Series  termination 
Semi-invariants  (Cumulants),  83,  86-92,  93-94 
Seminvariants  (Structure  seminvariants),  117,  118,  128-131
Series  termination,  265-266,  273-298 
Single  Crystal  Intensity  Project,  229-231,  243-262 
Skewness,  37,  211-212,  215,  216 
Smoothing,  265,  273 
Soft  constraints,  19,  44-49 
Space  groups,  301  (note) 

Fdd2,  Fddd,  72,  94
of  higher  symmetry,  26,  69-75,  88,  94,  155-157
P1,  2,  5-8,  25,  80,  117,  139,  142-157,  197-207
P1̄,  8-9,  26,  57,  59-61,  63-66,  76-77,  117,  197-207
P2₁,  245
Pmmm,  57,  60,  62,  77
Spectral  estimation,  278 
Standard  deviation,  see  Variance 
Standardized  moments,  93-94 
Stationary  series,  273,  274 
Stem-and-leaf  display,  257-259 
Stereochemistry,  effect  on  intensity  statistics,  175-177 
Structural  isomorphism,  11-14,  15-16 
Structure  determination,  301  (note) 
bias  in,  3,  182 

direct  methods,  3,  16,  113,  131,  195 
dispersion,  16,  133-172 
heavy-atom  method,  195 
isomorphous  replacement,  11-17 
least  squares,  15,  269-272 
robust/resistant  technique,  229-262 
tangent  formula,  16-17 
use  of  residual  R2,  195-223
see  also  Electron  density,  Parameter  estimation 